Technical Breakdowns

A collection of technical breakdowns and explainers about FM RF signals.

Nyquist Shannon Sampling

FM RF Signals

The quality of tape media correlates directly with the bandwidth used and the signal-to-noise ratio (SNR): the lower the bandwidth, the lower the potential SNR.

Thanks to Novgood for the explanation

In the frequency domain, you have to think in sine waves: everything can be expressed as some (simple or super complex) combination of different sine waves. Imagine you have 2 patches with different grey levels next to each other in a scan line.

Each one has an "exact" FM frequency (a perfect sine wave) to represent it, which means a constant grey level is ideally "infinitely" narrow-band (a single frequency). In this ideal case you'll get a signal discontinuity at the transition between the 2 sine waves, where one wave "ends" and the next "starts".

Even if you match the phase perfectly so that there's no jump in amplitude at the transition point, you'll still have a "kink" (that is, a jump in the derivative). Mathematically, you need "infinite" bandwidth to reproduce this perfect discontinuity with a superposition of different sine waves (that's what Fourier/spectral analysis does).

In reality, this transition between the 2 frequencies takes some finite amount of time (the edge of a PWM or clock signal is not perfectly vertical) - or translated to FM, the sine waveform gets bent to close the discontinuity gap between the 2 frequencies. In this transition region it looks more like the edge of a pulse than a sine/cosine function, which requires a whole bunch of superimposed sine waves with different frequencies to produce.

In fact, it takes the whole available bandwidth to approximate a sharp transition, and the "sharpness" of it merely reflects how much bandwidth was available. Don't know if this makes it clearer or adds to the confusion; just think in sine waves.
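The link between available bandwidth and edge sharpness can be checked numerically. The sketch below is a simplified baseband illustration, not the FM case itself: it band-limits an ideal step with a brick-wall FFT filter and measures the 10 % to 90 % rise time. The sample rate, the cutoffs and the helper names `brickwall_lowpass` and `rise_time` are assumptions chosen for the demonstration.

```python
import numpy as np

def brickwall_lowpass(x, fs, cutoff_hz):
    """Zero every FFT bin above cutoff_hz (an idealised band limit)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spec, n=len(x))

def rise_time(y, fs, lo=0.1, hi=0.9):
    """Time between the first lo and first hi crossing of a 0-to-1 edge."""
    i_lo = int(np.argmax(y >= lo))
    i_hi = int(np.argmax(y >= hi))
    return (i_hi - i_lo) / fs

fs = 40e6                          # hypothetical sample rate
n = 4096
step = np.zeros(n)
step[n // 4 : 3 * n // 4] = 1.0    # edges that happen within a single sample

rise_times = {}
for cutoff in (2e6, 4e6, 8e6):     # hypothetical bandwidth limits
    # Look only at a window around the rising edge at n // 4.
    edge = brickwall_lowpass(step, fs, cutoff)[n // 8 : n // 2]
    rise_times[cutoff] = rise_time(edge, fs)
    print(f"{cutoff / 1e6:.0f} MHz cutoff -> {rise_times[cutoff] * 1e9:.0f} ns rise time")
```

Each doubling of the cutoff should roughly halve the measured rise time, which is the time-domain view of "the sharpness reflects how much bandwidth was available".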

FM Deviation Range

The FM deviation range (4.2-5.4 MHz for Video8) refers to static signals, like pure grey or pure white. Whenever you have quick transitions between brightness levels, i.e. the FM frequency has to change rapidly from one value to another, it creates sidebands (higher and lower frequencies) which go far beyond the FM deviation range. The bandwidth required to capture these transitions depends only on how quick the transition is in the time domain. That's the reason Hi8 captures finer features: it supports faster FM transitions due to the higher overall bandwidth (not just the static deviation range).

It's the classic time-bandwidth relationship: the shorter the pulse, the more bandwidth you need to capture it. So the only practical way is to look at spectrograms of different RF samples and see where either the tape or the recorder has capped the usable bandwidth.

That's how I came up with 8 MHz for Video8 and 10 MHz for Hi8 (by eyeballing). Of course you can extend the margins based on gut feeling, but digital band-limiting filters can be made very narrow, so there shouldn't be any issue with artefacts at the HF edge.
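As a rough cross-check on eyeballed limits, Carson's rule gives a textbook estimate of the occupied FM bandwidth: about twice the sum of the peak deviation and the highest modulating frequency. The sketch below applies it to the Video8 deviation range quoted above. The 3 MHz luma bandwidth is an assumption for illustration, not a figure from the explanation, and `carson_band` is a hypothetical helper name.

```python
def carson_band(f_low_hz, f_high_hz, f_video_hz):
    """Estimate the occupied RF band with Carson's rule: BW ~ 2 * (deviation + f_video)."""
    deviation = (f_high_hz - f_low_hz) / 2      # peak deviation from the centre
    centre = (f_high_hz + f_low_hz) / 2
    bandwidth = 2 * (deviation + f_video_hz)
    return centre - bandwidth / 2, centre + bandwidth / 2

# Video8 static deviation range from the text; 3 MHz video bandwidth is assumed.
lower, upper = carson_band(4.2e6, 5.4e6, 3.0e6)
print(f"estimated band: {lower / 1e6:.1f} MHz to {upper / 1e6:.1f} MHz")
```

Carson's rule is only a rule of thumb (it covers most, not all, of the sideband energy), so inspecting spectrograms of real RF captures remains the practical way to find where the tape or recorder actually caps the bandwidth.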

Ghosting & Ringing

Yes, the jump causes ringing, even in my "ideal" example, because the transition happens within 1 sample and not mathematically instantaneously (that would require infinite bandwidth or an infinitely long signal). Limiting the bandwidth further doesn't really change the ringing, it primarily makes the transition less sharp. There are special filter shapes to minimize ringing (which I didn't use here).

In case anybody wonders, I added a little bit of noise to the FM signal, which is visible in the demodulated signal and seems to be gone in the filtered signal. That's because the noise is broadband (present at all frequencies): the filter removes most of the frequency range, and therefore most of the noise along with it.
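The noise argument can be illustrated with a few lines of NumPy. This is a minimal sketch under assumed parameters (white Gaussian noise, a brick-wall filter that keeps the lowest quarter of the spectrum); it is not the filter used for the plots described here.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable
n = 8192
noise = rng.standard_normal(n)      # broadband (white) noise

spec = np.fft.rfft(noise)
keep = len(spec) // 4               # keep only the lowest quarter of the bins
spec[keep:] = 0.0
filtered = np.fft.irfft(spec, n=n)

# Noise power scales with the fraction of the bandwidth that survives the filter.
power_ratio = np.mean(filtered**2) / np.mean(noise**2)
print(f"remaining noise power: {power_ratio:.2f} of the original")
```

Because white noise spreads its power evenly over frequency, keeping a quarter of the band leaves roughly a quarter of the noise power.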

ifb - You sometimes hear that square waves need infinite bandwidth. This is why. Transitions aren't instantaneous.

Visual Example

Top: Modulated / Middle: Peaks / Bottom: De-Modulated

Left side: 2 distinct sine waves with a mathematically "perfect" transition (within a single sample) - that's the bandwidth required to represent this transition.

Right: if the bandwidth is limited around each signal, the transition is forced to be more mixed and gradual

Time Base Correction

Software Decoding

Detect sync/blank level from reference
Find the 50% point between sync tip and 0 IRE (black level)
Detect Vertical Sync Pulse to define sync

It starts out by trying to detect an approximate sync/blank level (using the level from the format specs as a starting point/reference), then looks for where the half-way point between sync tip and 0 IRE is crossed, then tries to determine what type each detected pulse is.
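The threshold-and-classify idea can be sketched in plain Python. This is an illustrative sketch only, not the vhs-decode code: the function names, the length tolerance and the pulse categories are assumptions, and real pulse lengths depend on the format and sample rate.

```python
def find_pulses(samples, sync_tip, blank):
    """Return (start_index, length) for every run below the 50 % level."""
    threshold = (sync_tip + blank) / 2
    pulses, start = [], None
    for i, value in enumerate(samples):
        if value < threshold and start is None:
            start = i                         # falling edge crossed the threshold
        elif value >= threshold and start is not None:
            pulses.append((start, i - start)) # rising edge closes the pulse
            start = None
    return pulses

def classify_pulse(length, hsync_len, tolerance=0.3):
    """Guess the pulse type from its length (hypothetical thresholds)."""
    if abs(length - hsync_len) <= tolerance * hsync_len:
        return "hsync"
    if length > 2 * hsync_len:
        return "vsync"
    if length < hsync_len:
        return "equalizing"
    return "unknown"

HSYNC_LEN = 10  # nominal hsync length in samples, assumed for this toy signal
line = [0] * 5 + [-40] * 10 + [0] * 20 + [-40] * 5 + [0] * 10 + [-40] * 50 + [0] * 5
pulses = find_pulses(line, sync_tip=-40, blank=0)
kinds = [classify_pulse(length, HSYNC_LEN) for _, length in pulses]
print(list(zip(pulses, kinds)))
```

In the toy signal the three pulses have lengths of 10, 5 and 50 samples, so they are labelled as an hsync, an equalizing pulse and a vsync pulse respectively.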

Then it tries to first detect where the vertical sync is by looking for the vertical sync pulse sequence (depending on format).

For VHS, if --fallback_vsync is specified, it will also do a more primitive search for just long pulses if that fails. If no vsync is found, it will guess a position based on an earlier vsync, if one exists.

When that is done, it will try to align the horizontal sync pulses it found to the expected line starts using the falling edge (I think). It goes through and tries to refine each starting point by approximating where the half-way point of the falling edge lies between the level of the back porch and the level of vsync.

vhs-decode has code that will do the same refinement but using the rising (right) edge instead (can be disabled with --drh option).
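A sub-sample edge refinement of this kind can be sketched with linear interpolation between the two samples that straddle the half-way level. This is a hedged illustration rather than the decoder's actual routine: `refine_edge`, its `search` window and its `rising` switch are hypothetical names introduced here.

```python
def refine_edge(samples, approx_idx, high, low, rising=False, search=4):
    """Return the fractional index where the edge crosses the half-way level.

    Returns None when no crossing is found near approx_idx, so the caller
    can fall back to the unrefined position.
    """
    target = (high + low) / 2
    first = max(approx_idx - search, 0)
    last = min(approx_idx + search, len(samples) - 1)
    for i in range(first, last):
        a, b = samples[i], samples[i + 1]
        if not rising and a >= target > b:    # falling (left) edge
            return i + (a - target) / (a - b)
        if rising and a < target <= b:        # rising (right) edge
            return i + (target - a) / (b - a)
    return None

falling = [0, 0, 0, -10, -30, -40, -40]
rising_edge = [-40, -40, -30, -10, 0, 0]
left = refine_edge(falling, approx_idx=4, high=0, low=-40)
right = refine_edge(rising_edge, approx_idx=3, high=0, low=-40, rising=True)
print(left, right)
```

Returning `None` on failure keeps the "refinement failed" case explicit, which matters because a wrong refinement is worse than the coarse estimate.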

Using the right edge is usually better, as the left one tends to be more affected by overshoot, especially on dubbed recordings. However, it can cause issues if the hsync length is abnormal, since it won't line up properly.

The refine code still sometimes fails and doesn't always manage to detect that it failed, so I recently added an option to skip it (which helps in those cases but results in a bit more jitter otherwise).

For laserdisc there is additional code that also aligns using burst on NTSC and the pilot signal on PAL. Using burst is a bit trickier on tape formats (and won't work on dubbed tapes due to playback AFC messing with it), and a pilot signal only exists on certain formats (and only for PAL), so neither has been added as of now.

Decode Process

RF -> FFT -> RF Filter -> iFFT -> Demod -> FFT -> VideoFilter(Deemphasis) -> iFFT -> Sync Detection(find fields) -> TBC
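The first stages of this chain can be sketched with NumPy on a synthetic signal. This is a minimal sketch under stated assumptions, not the decoder's implementation: the sample rate, the brick-wall filter shapes and the 1 MHz and 3 MHz limits are made up for the demonstration, a plain low-pass stands in for the video filter/de-emphasis stage, and sync detection and TBC are left out. The 4.2 MHz and 5.4 MHz levels and the 8 MHz upper limit are taken from the Video8 figures above.

```python
import numpy as np

def fft_filter(x, response):
    """FFT -> multiply by a per-bin response -> iFFT."""
    return np.fft.irfft(np.fft.rfft(x) * response, n=len(x))

def analytic_signal(x):
    """Build the complex analytic signal by dropping the negative frequencies."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    weights[1 : (n + 1) // 2] = 2.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    return np.fft.ifft(np.fft.fft(x) * weights)

def fm_demodulate(rf, fs):
    """Instantaneous frequency in Hz from the phase of the analytic signal."""
    phase = np.unwrap(np.angle(analytic_signal(rf)))
    return np.gradient(phase) * fs / (2.0 * np.pi)

fs = 40e6                                   # hypothetical sample rate
n = 4096
# Two grey levels: 4.2 MHz for the first half, 5.4 MHz for the second half.
inst_freq = np.where(np.arange(n) < n // 2, 4.2e6, 5.4e6)
rf = np.cos(2.0 * np.pi * np.cumsum(inst_freq) / fs)

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
rf_band = ((freqs > 1e6) & (freqs < 8e6)).astype(float)   # RF filter (assumed shape)
video_lpf = (freqs < 3e6).astype(float)                   # stand-in video filter

rf_filtered = fft_filter(rf, rf_band)       # RF -> FFT -> RF filter -> iFFT
demod = fm_demodulate(rf_filtered, fs)      # Demod
video = fft_filter(demod, video_lpf)        # FFT -> video filter -> iFFT

first_level = float(np.median(video[n // 8 : 3 * n // 8]))
second_level = float(np.median(video[5 * n // 8 : 7 * n // 8]))
print(f"recovered levels: {first_level / 1e6:.2f} MHz and {second_level / 1e6:.2f} MHz")
```

The demodulated output is the instantaneous frequency, so away from the transition it should settle near the two carrier frequencies that encode the grey levels.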

https://discord.com/channels/665557267189334046/665834485975351307/1147712663489630268

Novagood non-linear de-emphasis

https://discord.com/channels/665557267189334046/665834485975351307/1151941027809206363

Instead of pseudo-code, here is a more detailed documentation of my (current) nonlinear de-emphasis function. The signals at each step are uniquely numbered so it should be easier to keep track.

Inputs:
(1) FM-demodulated luma (time domain array, real)
(2) Static de-emph filter (frequency domain, complex)
(3) HPF used for expander (frequency domain, complex)

I use FFT filters, so the filters are defined as complex transmission functions (magnitude/phase) in the frequency domain and are generated externally before calling this function (e.g. by evaluating the IIR coefficients at every frequency step). The de-emphasis filter is a shelf filter and the HPF is a standard Butterworth (or similar) filter with adjustable order and corner frequency. The filter generation routine runs only once when parameters are changed and the filters are buffered and re-used for every call of the de-emph function (i.e. for every field/frame/chunk).

Main de-emphasis:
(4) Take FFT of demodulated luma
(5) Multiply (4) with (2); result is de-emph'd luma in the frequency domain
(6) Take real-valued IFFT of (5), which is the de-emph'd luma in the time domain

Nonlinear "sub-de-emphasis":
(7) Multiply (5) with (3); result is a high-pass filtered version of the de-emph'd luma, still in the frequency domain
(8) Take complex IFFT of (7) to get the time-domain version; the complex IFFT recovers the imaginary part of the time-domain signal like the Hilbert transform does

Up to this step you could also use IIR filters (and a Hilbert transform for step 8) instead of FFT filters as long as you end up with two time-domain signals, the statically de-emph'd luma (6, real) and a high-pass filtered version of it (8, complex). The actual expander operates in the time domain using these two signals.

(9) Take the magnitude of (8), which is the instantaneous signal level for the expander, and apply some scaling; I use a linear (multiply by some number) and a nonlinear (take to some power) scaling parameter
(10) Take the real part of (8), which is the actual HF luma signal, and multiply it with (9); as a result, the HF luma signal gets (nonlinearly) dampened wherever its instantaneous level is high
(11) Subtract (10) from (6); that's the second (nonlinear) de-emphasis step

Output: (11)

There are certainly more elegant ways to implement this, especially making the parameter choice more convenient. Anyway, the basic idea is that the second (dynamic) de-emphasis step reduces the HF part of the luma signal (just like the first static step), but now the amount of reduction depends on the instantaneous level of this same HF part (signal 9). Imagine in step 10 you just take the high-pass filtered luma and subtract it as-is from the (statically) de-emph'd luma (signal 6) - that's the same as simply low-pass filtering (6) with the inverse of (3) and you lose all HF detail. By level-dependent scaling of (10) you can control how much of the HF part you subtract, i.e. most of it when the HF-level is low (e.g. noise) and none of it when the HF-level is high (e.g. bright/dark edge). The nonlinear scaling approach in step 9 is a simple empirical approximation which seems to work reasonably well. This can (and should) be replaced if someone comes across the actual specs of the expander circuits used for the different tape formats (including the shape of filter 3) and a smart way to implement it digitally...
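The numbered steps can be sketched in NumPy as follows. This is an interpretation under several assumptions, not the author's code: the filters are first-order stand-ins evaluated on the FFT grid, the one-sided spectrum is built explicitly so the complex IFFT yields an analytic signal, and the scaling in step 9 uses a negative exponent with a clip at 1 so that the gain falls as the HF level rises. The names `nonlinear_deemphasis`, `lin` and `power`, and every numeric value, are hypothetical.

```python
import numpy as np

def nonlinear_deemphasis(luma, deemph, hpf, lin=1.0, power=-0.5):
    """Static de-emphasis followed by a level-dependent HF subtraction."""
    n = len(luma)
    spec = np.fft.fft(luma)                    # (4) FFT of demodulated luma
    deemph_spec = spec * deemph                # (5) static de-emphasis
    static = np.fft.ifft(deemph_spec).real     # (6) de-emph'd luma, time domain
    hf_spec = deemph_spec * hpf                # (7) high-pass filtered copy
    # (8) Keep only non-negative frequencies so the IFFT is an analytic signal.
    onesided = np.zeros(n, dtype=complex)
    onesided[0] = hf_spec[0]
    onesided[1 : (n + 1) // 2] = 2.0 * hf_spec[1 : (n + 1) // 2]
    if n % 2 == 0:
        onesided[n // 2] = hf_spec[n // 2]
    hf = np.fft.ifft(onesided)
    # (9) Instantaneous HF level with linear and power-law scaling (assumed form).
    level = np.maximum(np.abs(hf), 1e-12)      # avoid 0 ** negative
    gain = np.minimum(1.0, (lin * level) ** power)
    correction = hf.real * gain                # (10) scaled HF luma
    return static - correction                 # (11) output

fs = 13.5e6                                    # hypothetical luma sample rate
n = 1024
f = np.fft.fftfreq(n, d=1.0 / fs)
deemph = (1 + 1j * f / 1.0e6) / (1 + 1j * f / 0.25e6)   # stand-in shelf filter (2)
hpf = (1j * f / 1.0e6) / (1 + 1j * f / 1.0e6)           # stand-in first-order HPF (3)

rng = np.random.default_rng(1)
luma = np.where(np.arange(n) < n // 2, 0.2, 0.8) + 0.01 * rng.standard_normal(n)  # (1)
out = nonlinear_deemphasis(luma, deemph, hpf)
```

With `power=0` the gain is 1 everywhere and the function reduces to plain low-pass filtering of (6) with the complement of (3), which is the "lose all HF detail" case described above; a very large `lin` with a negative `power` drives the gain to zero and returns (6) unchanged.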
