Visuals have been at the core of the Audio Industry

First things first, this is a work-in-progress article, meaning I’ll most likely keep it inconclusive and keep adding to it until I feel it’s complete; that is when this paragraph will get removed. I didn’t want to hold back on this idea reaching people, hence I’m publishing it as I iterate over it. Drop me a message using the Contact page if you have something to add.

Waveforms: how easily we take them for granted, as if they had always been around. “Apply a fade” sounds trivial, doesn’t it? “Move that part of the audio a bit to the left”, “This transient can make this section pop out a bit”, and other similar phrases are part of our day-to-day lives.

It’s very important to remember that in order to apply fades, split, or move things around, the first step is having the idea of the waveform itself. Visualisation of sound could have gone in many directions, sonic pun intended, but it landed on the form we see today. [more thoughts on waveform to be added here]
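To make that concrete, here is a minimal sketch (assuming NumPy is available; the sample rate, tone, and fade length are placeholder choices) of how little is hiding behind “apply a fade” or “move that part” once the waveform is treated as a sequence of amplitude values over time:

```python
import numpy as np

sample_rate = 44_100                        # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate    # one second of timestamps
audio = 0.5 * np.sin(2 * np.pi * 440 * t)   # a 440 Hz tone standing in for recorded audio

# "Apply a fade" is just multiplying the tail of the waveform by a ramp.
fade_len = int(0.1 * sample_rate)           # 100 ms fade-out
audio[-fade_len:] *= np.linspace(1.0, 0.0, fade_len)

# "Move that part a bit to the left" is just slicing and re-joining samples.
clip = audio[fade_len:]                     # drop the first 100 ms
```

Everything a DAW draws and lets us edit visually ultimately resolves to operations like these on arrays of numbers.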

The spectrogram is another piece of art: by “art” I don’t mean that some audio file happens to look cool in its spectrogram view, but the process and idea of visualising sound as an image, where intensity of colour represents loudness, which eventually enabled us to edit frequency regions with “lasso” tools! [more on Spectrogram to be added]
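As a rough illustration of that idea, here is a hedged sketch (assuming SciPy and Matplotlib are installed; the sweep signal and window size are placeholders): the signal is sliced into short windows, each window is transformed to the frequency domain, and magnitude is painted as colour over a time/frequency grid.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import chirp, spectrogram

sample_rate = 44_100
t = np.arange(2 * sample_rate) / sample_rate
audio = chirp(t, f0=220, t1=2.0, f1=2_000)          # an upward sweep as stand-in audio

# Short windows of the signal, an FFT per window, magnitude kept per bin.
freqs, times, Sxx = spectrogram(audio, fs=sample_rate, nperseg=1024)

# Colour intensity encodes loudness (here in dB); these bright and dark regions
# are what "lasso"-style spectral editors let you select and modify.
plt.pcolormesh(times, freqs, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```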

The next big thing comes down to what we build on top of these zeroth-level steps. MFCCs, and many other ways to interpret audio, have already opened up new doors. [a few more examples of visualisation to be added here]
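One sketch of such a door, assuming the librosa library is available (the test tone and coefficient count below are placeholder choices): MFCCs compress each short frame of audio into a small vector describing its spectral shape, a very different “picture” of sound than a waveform or spectrogram.

```python
import librosa

sample_rate = 22_050
audio = librosa.tone(440, sr=sample_rate, duration=1.0)   # stand-in signal

# 13 Mel-frequency cepstral coefficients per analysis frame.
mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=13)
print(mfccs.shape)   # (13, n_frames): a compact per-frame summary of timbre
```

Representations like this are less about looking at sound and more about handing it to algorithms, but they start from the same zeroth-level decision of how to turn audio into a picture made of numbers.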

As food for thought, I’ve been brainstorming over the concepts of sound and audio. We often refer to a synth sound as “sound”. You might want to call it “audio”. Imo, at least as of 10:29pm here, I prefer the word “data” for anything that’s created inside the computer.

For me, the physical movement of molecules is “vibration”. Those vibrations reaching our ears and happening to be above 20 Hz are “sound”, the perceived experience. Anything below 20 Hz is a “feeling”, for lack of a better word as I write this. If I record that vibration onto analog or digital media, that ‘thing’ is “audio”, which in essence is “data”.

Going in the other direction, if something was generated in the analog or digital medium itself, it’s just “data” until it reaches the speakers, at which point it becomes “sound”; our realisation that the “data” was perceived by our brain as “sound” is what makes it “audio”. Makes sense? [Will write more on using this to think of visualisation]