Upscaling and what it means?


#1

So I’m unfamiliar with the practice of upscaling; I’m squarely in the camp of “plug a thing in and it should sound good”…then start researching and see what else can be done to mod/tweak/upgrade to put my personal spin on a thing.

So my question is: what is upscaling, and how do I go about getting a good understanding of it? Can it be as simple as changing the settings in Audio MIDI Setup on Macs, or do I need a specific piece of equipment?

This whole question is being brought up due to @Torq and his new Chord toy :smile:

Edit: I’ll be researching it also, and share what I find.


#2

Well, to me upscaling is a process by which a video signal is scaled up to display on a higher-resolution display than the original signal was made for. For example, a 720p video signal must be upscaled to fit on a 1080p display. To do this, an algorithm is used to determine what color the missing pixels need to be. If it needs to fit a new pixel in between two others, and those two pixels are both red, then the new pixel should be red. That’s easy. But what if one were blue and the other yellow? Should it fill with blue, yellow, or some blend of the two? Clearly, this is complicated, and can lead to visual artifacts if done poorly.
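As a very rough sketch of that “blend the two” idea (purely illustrative Python, not any real scaler’s algorithm):

```python
# Purely illustrative: a new pixel inserted between two existing ones as a
# weighted blend of their colors. Real scalers use far more sophisticated kernels.

def interpolate_pixel(left, right, weight=0.5):
    """Linearly blend two RGB pixels; weight=0.5 lands halfway between them."""
    return tuple(round((1 - weight) * l + weight * r) for l, r in zip(left, right))

red = (255, 0, 0)
blue = (0, 0, 255)
yellow = (255, 255, 0)

print(interpolate_pixel(red, red))       # (255, 0, 0) - the easy case: still red
print(interpolate_pixel(blue, yellow))   # (128, 128, 128) - a murky grey; naive blends are where artifacts come from
```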

With audio, I think it is commonly referred to as upsampling. The same principle applies, though: you’re creating new information based on what comes before and after in the audio stream. While it can lead to higher-resolution sound, it can also introduce audio artifacts.


#3

Upsampling…oops that is what I meant.

What are the positives and negatives of this, besides the obvious risks of a poor implementation in the coding and of adding bits that don’t belong?

For instance, in Audio MIDI Setup, if I put the setting higher than 48 kHz, what does that actually do? Is this considered upsampling?

Yes I’m asking silly questions on purpose :wink:


#4

It stops me from asking them. :grin: This is a great topic to discuss; thanks for putting it out there. It’s fascinating what the boffins can do now. It’s really clever. I was vaguely aware of upscaling but, like many others I suppose, I don’t have a great understanding beyond the basic principles of how it works.


#5

For audio purposes, as @ProfFalkin has said, we usually refer to “upscaling” as “upsampling” - and sometimes as “oversampling”. These terms are often used interchangeably, although a more proper application of them would be that “oversampling” means sampling at a higher rate than Nyquist for the source signal and “upsampling” means performing a sample-rate-conversion (SRC) from one already-sampled source to a higher rate.

While both “oversampling” and “upsampling” work to solve a similar problem, specifically to make it easier to implement the necessary filters for sampling (brick-wall/anti-aliasing) and replay (anti-imaging/reconstruction), the first is applied at capture time in the ADC and the second by the DAC (though there are some DAC architectures which also do internal “oversampling” earlier in their conversion steps).

If you sample a normal audio signal at 44.1 kHz, which is the CD standard, you need a brick-wall filter that absolutely ensures no audio information is passed to the ADC with a frequency higher than 22,050 Hz (otherwise you’ll get aliases - i.e. false data - folded down into the audio band). If you want a flat response from 20 Hz to 20 kHz, then that means you have to attenuate the input from 0 dBFS to -96 dBFS over just 2,050 Hz. If you oversample the input at, say, 176.4 kHz, for the same audio content, your filter now simply has to go from 0 dBFS to -96 dBFS over a span of 68.2 kHz (88.2 kHz - 20 kHz). Which is a much shallower curve and easier (and cheaper) to engineer reliably.

Remember that the input filter operates in the analog domain as it must occur prior to the signal reaching the ADC!
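If it helps to see the arithmetic behind those numbers, here’s a quick sketch (just the figures from above, nothing vendor-specific):

```python
# The anti-aliasing filter must stay flat to 20 kHz and be fully attenuating by
# Nyquist (half the sample rate). Oversampling widens that transition band hugely.

AUDIO_BAND_TOP_HZ = 20_000  # highest frequency we want to keep flat

for sample_rate_hz in (44_100, 176_400):
    nyquist_hz = sample_rate_hz // 2
    transition_band_hz = nyquist_hz - AUDIO_BAND_TOP_HZ
    print(f"{sample_rate_hz} Hz sampling: Nyquist = {nyquist_hz} Hz, "
          f"transition band = {transition_band_hz} Hz")

# 44100 Hz sampling: Nyquist = 22050 Hz, transition band = 2050 Hz
# 176400 Hz sampling: Nyquist = 88200 Hz, transition band = 68200 Hz
```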

There’s a decent overview of it, with illustrations and examples, here. And I’m happy to get into a detailed discussion on specific aspects of it as needed/desired.


It is worth noting that many DACs, and in particular delta-sigma designs, already do their own upsampling - whether you want them to or not (though some allow you to choose if it happens, and sometimes by how much)!

Schiit’s entire multi-bit line over-samples (for Yggdrasil it is to 8x … or 8 fs - where “fs” is the base sample rate, so 44.1 kHz input gets upsampled to 352.8 kHz), and Chord’s DACs do even more extreme upsampling, in two stages - with DAVE, for example, first to 16 fs and then on to 256 fs.

These DACs also use proprietary filters (“Super Combo Burrito” for Schiit’s line, “Watts Transient-Aligned” for Chord’s, for example). A typical filter, built into a DAC chip, might use 256 “taps”. When you see references to “tap length” or “filter length”, each “tap” is a single filter coefficient, and the longer the filter, the closer its outermost coefficients get to zero - i.e. the more of the ideal filter’s response it captures. Higher sample rates require longer filters (more taps) to achieve this. There is no benefit to having a million-tap filter on raw 44.1 kHz (non-upsampled) content, as the vast majority of the taps would have an effectively zero coefficient.
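To see what a “tap” looks like in practice, here’s a small sketch (a plain sinc kernel for 2x interpolation - not Schiit’s or Chord’s actual filter):

```python
# Each tap is one coefficient of the filter. An ideal reconstruction filter is a
# sinc function whose coefficients shrink toward zero away from the center, so
# longer filters (more taps) capture more of that tail and sit closer to ideal.
import numpy as np

taps = 1_001                        # odd length so there is a center tap
n = np.arange(taps) - taps // 2     # tap index, centered on zero
coeffs = np.sinc(n / 2)             # ideal 2x-interpolation kernel, sin(pi*x)/(pi*x)

print(f"center tap:           {coeffs[taps // 2]:+.6f}")        # +1.000000
print(f"101 taps from center: {coeffs[taps // 2 + 101]:+.6f}")  # ~ +0.006
print(f"outermost tap:        {coeffs[-1]:+.6f}")               # effectively zero
```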


Moving on from the theory, let’s talk about actual application and software - per the questions in the original post.

Upsampling can, indeed, be done in software. In fact, for both macOS and Windows, if you set the output rate for your audio device (e.g. via the Audio MIDI Setup utility on macOS) to a higher rate than the source material being played, then the OS will upsample the content on the fly.

This is generally NOT a desirable thing as you have no control over how this upsampling is done, and there are multiple approaches, filters and levels of precision that can be applied, which have different implications and potential artifacts - the built-in OS upsampling generally isn’t as good as dedicated software.

Of note, here, is what happens by default in Android-based systems. Android’s standard audio-stack assumes a sample rate of 48 kHz. Any source material not at a multiple of 48 kHz undergoes sample-rate-conversion. For example, standard streaming content, CD content, and most compressed audio will be resampled from 44.1 kHz to 48 kHz. This is a non-integer conversion, which makes the math and precision much more involved (and critical) than a simple powers-of-two conversion (e.g. 48 kHz -> 96 kHz).
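As a quick illustration of why 44.1 kHz -> 48 kHz is the awkward case (simple arithmetic, nothing Android-specific):

```python
# 44.1 kHz -> 48 kHz is not a whole-number ratio: the resampler has to
# interpolate by one factor and decimate by another. 48 kHz -> 96 kHz, by
# contrast, is a clean doubling.
from fractions import Fraction

def src_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Destination/source sample-rate ratio in lowest terms."""
    return Fraction(dst_hz, src_hz)

print(src_ratio(44_100, 48_000))   # 160/147 -> interpolate by 160, decimate by 147
print(src_ratio(48_000, 96_000))   # 2       -> simple integer (power-of-two) upsample
```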

More precise conversions and filters (e.g. an ideal sinc filter) are more demanding in terms of power (battery) and CPU than is ideal for a cellphone, and as a result those sample-rate-conversion implementations are optimized for power rather than quality. Thus we want to avoid that conversion in the device if we can, and this is one reason why Android-based DAPs sometimes tout having a custom audio stack to bypass this process.

Going further …

On a Mac or a PC, there are myriad ways to do upsampling in software. Many high-end music-player applications allow you to enable upsampling, and they generally implement much more sophisticated schemes than you’ll find built into the OS.

Audirvana+, for example, allows you not only to specify many of the details of how the upsampling is performed, and to what degree, but also to choose between two different upsampling engines, “SoX” (open source) and “iZotope”.
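For a sense of what such an engine is actually doing, here’s a minimal sketch using SciPy’s polyphase resampler - this is not SoX’s or iZotope’s actual code, just the same basic operation:

```python
# Upsample a 44.1 kHz signal to 176.4 kHz (a friendly 4x conversion) with a
# polyphase sample-rate converter.
import numpy as np
from scipy.signal import resample_poly

SRC_RATE = 44_100
DST_RATE = 176_400

# One second of a 1 kHz test tone at the source rate.
t = np.arange(SRC_RATE) / SRC_RATE
signal_44k1 = np.sin(2 * np.pi * 1_000 * t)

# The up/down factors are the rate ratio in lowest terms (here simply 4/1).
signal_176k4 = resample_poly(signal_44k1, up=DST_RATE // SRC_RATE, down=1)

print(len(signal_44k1), "->", len(signal_176k4))   # 44100 -> 176400
```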

If you want more control, and even more sophisticated approaches, including control over things like filter type, tap length and noise-shaping (required by all 1-bit, delta-sigma and DSD conversions), then you want to look at “HQPlayer”.

Most conversions, at sane upsampling rates, can be done easily on the fly. However, extreme upsampling - and the resulting long, complex filters and noise-shapers you want to apply there - is VERY processing-power intensive. HQPlayer, for example, converting 44.1 kHz PCM to DSD512, and then using the highest-fidelity poly-sinc filter and high-order noise-shaping, will require a dedicated multi-core computer (or significant GPU compute capacity) to work, and even then can have significant startup latency.


Hardware up-samplers/filters originated when the required processing was more than could easily be accommodated on reasonably priced, general-purpose hardware/computers. Most of that is now handled in software in the real world (either on the computer or by a basic DSP chip in the DAC).

Extreme hardware up-sampling, and in particular the necessary filtering and noise-shaping you must apply to get the benefits of it, still requires serious processing power (as per the HQPlayer example above). This is where things like Chord’s M-Scalers come in … as they use a massively-parallel DSP approach to do both the upsampling and then the complex filtering and noise-shaping over very long tap length filters.

The Chord Hugo M-Scaler, which is to my knowledge the most advanced and extreme hardware audio upsampler/filter available, uses an FPGA that provides 740 DSP cores, and utilizes 528 of those in parallel to upsample to 4096 fs before applying a 1,015,808-tap implementation of Rob Watts’ “WTA” filter, and reducing the final output rate to something the DAC can handle (up to 768 kHz in the case of Chord’s newer DACs). And even with such powerful hardware on tap, this incurs about a 1.4 second latency. And the result of this is effectively an ideal implementation of a sinc filter that optimally recovers the originally sampled data for material up to 44.1 kHz and 16 bits, and gets closer than anything else I’m aware of for higher rates and bit-depths.
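As a rough back-of-the-envelope of why that needs massively parallel hardware (my own estimate, assuming a naive direct-form FIR evaluated at a 705.6 kHz, i.e. 16 fs, output rate - not Chord’s published figures):

```python
# Naive cost of running a ~1-million-tap FIR at a 705.6 kHz output rate.
taps = 1_015_808
output_rate_hz = 705_600            # assumed output rate for this estimate

macs_per_second = taps * output_rate_hz
print(f"{macs_per_second / 1e9:.0f} billion multiply-accumulates per second")
# ~717 billion MAC/s - well beyond comfortable general-purpose CPU territory,
# hence the massively parallel FPGA/DSP approach (real implementations also
# exploit filter structure rather than brute-forcing it like this).
```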


So, short version - you can experiment with upsampling (and filtering) in software. Doing so to a high degree requires special software and powerful hardware. And otherwise you can look at various hardware options, the highest-spec of which is, today, the M-Scaler. From there, the rubber meets the road as you start to consider the audible effects of this processing vs. what it means in terms of math, theory and the demands/easements it enables on the actual hardware implementation.


#6

So to summarize, the reason one might use an M-Scaler is that one either 1) has a NOS (non-oversampling) DAC or 2) has an oversampling DAC but believes that the M-Scaler can oversample better?


#7

Pretty much.

How applicable the M-Scaler’s, or software like HQPlayer’s, upsampling is to a given DAC is also influenced by how said DAC treats its input. Most oversampling DACs (which is most DACs) have a fixed maximum level of oversampling they’ll apply - beyond which higher resolution input isn’t oversampled.

For DACs that accept input at rates that defeat their internal oversampling (e.g. an original Schiit Bifrost doesn’t oversample content fed to it at 176.4 or 192 kHz), there is, theoretically, more of a benefit than to one that can oversample further (e.g. Yggdrasil accepts 192 kHz input, but oversamples internally to 384 kHz).

The filter and noise shaping should have more of an audible effect than the upsampling part. The ability to apply closer-to-ideal filtering is just dependent on higher sampling rates. And it is the filtering and noise-shaping that really chews up the processing time. Something you can easily experiment with in the trial version of HQPlayer for example - compare the coarser filters and lower-order noise-shapers on any level of upsampled content, and you’ll quickly see that.
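If you want to see the basic idea behind noise-shaping, here’s a heavily simplified, first-order error-feedback sketch (real DSD/delta-sigma converters use much higher-order shapers):

```python
# Quantize to 1 bit while feeding each sample's quantization error back into the
# next one, which pushes the quantization noise up and out of the audio band.
import numpy as np

def noise_shape_1bit(samples):
    out = np.empty_like(samples)
    error = 0.0
    for i, x in enumerate(samples):
        shaped = x - error                   # subtract the previous sample's error
        quantized = 1.0 if shaped >= 0 else -1.0
        error = quantized - shaped           # error to feed back into the next sample
        out[i] = quantized
    return out

rate = 44_100
t = np.arange(rate) / rate
tone = 0.5 * np.sin(2 * np.pi * 1_000 * t)   # one second of a quiet 1 kHz tone
bitstream = noise_shape_1bit(tone)
print(bitstream[:8])                          # a run of +/-1.0 values
```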


#8

Thinking about the context of this some more last night …

It’s probably worth pointing out (even though I would hope it is largely self-evident) that, absent wanting to alter the performance of a high-quality true-NOS DAC (such as the Holo Audio or Metrum units), upsampling/filtering is mostly in the realm of “things you tweak once you have everything else where you want it”.

For example, high-quality EQ will have a much more pronounced (and useful) effect on one’s listening than upsampling - and one that you can always ensure is beneficial to you, as you’re able to apply it “to taste”.

Buying a better transducer is the next most prominent change you can make, followed by amps and DACs.

Indeed, if we just focus on DACs as a case-in-point, just sitting here with the M-Scaler and three of Chord’s DACs (DAVE, Hugo 2, Qutest) that can take full advantage of it, I would say that while I do hear definite differences, all of which so far I would classify as improvements when applying the M-Scaler to the chain, what I do NOT find is that the addition of the M-Scaler alters the performance ranking of those units.

In other words, the M-Scaler -> Hugo 2 or M-Scaler -> Qutest chain does not, for me, result in an across-the-board better end result than using DAVE on its own. Which would mean that if I was building my system again, I would still want to get to the point where I’d bought DAVE, and gotten the rest of my chain optimized, before I bought the M-Scaler.

At least with software-based approaches the investment is far less (<$200 for the best software way to do this sort of thing I know of). Although to take that to its ultimate capability you’ll still be putting a couple of thousand dollars into the necessary hardware to run the software at that level reliably.

Anyway, long story short … there’s what upsampling is useful for as applied internally in almost all DACs, and the number of rather tricky problems it helps to address there, vs. extreme upsampling and special filtering beyond that. The former is high-value and low-cost; the latter is higher-cost, and its value is heavily dependent on a number of other factors.


#9

The joys of this hobby are that there are sooo many options, and no one person will ever have quite the same setup as another, which allows for this kind of knowledge-sharing and discussion =) I think upsampling is just another cool option in the chain of many other tweaks/mods you can do. I also agree it is the pursuit of one’s own preferences toward perfection that allows for these kinds of discussions and this hobby. I started researching this knowing that others had way more knowledge than I do, and the majority of the articles I found were too biased toward the writer’s own opinion for me to take overly seriously. Hence me posting it here. Thank you for taking the time to post valuable information, and for passing down some knowledge!