Wava - A .wav file compare and repair utility.



News

- 19 June, 2005: added reference to wavm project as it relates to comparing two similar wav files made from analog sources.

- Aug 25, 2004: 0.2.2 is released. It incorporates the histogram functions from the first step. Reading one file allows statistics to be computed, reading two files enables the two-way compare, and reading three files permits a three-way compare. A simple help function is also added. It is still a standalone file requiring no DLL or other support files.

- Aug 19, 2004: 0.1.2 is released. The two-way compare is upgraded to locate missing samples and then continue the file comparison. The three-way compare is still a sample-for-sample comparison.

Background

There are many applications to process and modify sound files, but what about trying to restore or find the original source?

This project (project page) is about discovering and fixing bit errors in sound files. Note that such small errors will probably not even be noticed when listening. While this is not an everyday “fun” application, there may be instances where wav files get damaged and restoring them at the byte and bit level is required. Beyond fixing bit errors, things get more interesting when working with files created from analog sources. The work at that point becomes much more complicated, and I will have to wait to see how things go.

One example where this utility is useful is comparing wav files taken from a CD with different applications.

In one case I compared a wav file extracted with Exact Audio Copy with a file extracted (lossless mode) with Windows Media Player Version 9 and converted to a wav file with mplayer. The trailing 486838 bytes in a 41905628 byte file were lost with the Media Player. For practical purposes, negligible sound information was dropped by the Media Player. No big deal, but if I had computed a checksum and compared it to a friend's checksum, I would have wondered about the difference.

Related projects by this author: wavb, a utility for generating test tone wav files (with voice labels indicating frequency); wavc, for hearing tests; and wavm, for averaging two similar wav files.

Examples

Two way comparison also includes time from start values as well as the sample values themselves.




Illustration 1: Two-way compare screenshot
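
As a rough illustration of the "time from start" value (this is not Wava's actual code), a sample position can be converted to a time offset by dividing by the sample rate. The function name format_offset and the 44100 samples/second default are assumptions made for this sketch.

```python
def format_offset(frame_index, sample_rate=44100):
    """Convert a sample-frame index into a mm:ss.mmm time-from-start string.

    sample_rate defaults to 44100 frames per second (assumed CD-style audio).
    """
    # Work in whole milliseconds so the displayed fields never overflow.
    total_ms = round(frame_index * 1000 / sample_rate)
    minutes, ms = divmod(total_ms, 60000)
    seconds, ms = divmod(ms, 1000)
    return f"{minutes:02d}:{seconds:02d}.{ms:03d}"


# A mismatch found at frame 2712150 is 61.5 seconds into the file.
print(format_offset(2712150))
```

For a stereo file, one frame holds a left and a right sample, so the frame index is the sample index divided by the channel count.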




At this point, a three-way compare function is available.


Illustration 2: Example of Wava file compare
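
One way a three-way, sample-for-sample compare can also repair data is a bitwise majority vote: each bit takes the value that at least two of the three copies agree on. The sketch below shows that general technique under the assumption that the three copies are the same length and are never damaged at the same bit. It is not taken from Wava itself, and the function names are made up for the example.

```python
def majority_vote(a, b, c):
    """Return the bitwise majority of three equal-length byte strings."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("all three inputs must have the same length")
    # A result bit is 1 when at least two of the three input bits are 1.
    return bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))


def disagreements(a, b, c):
    """Return the byte offsets where the three copies are not all identical."""
    return [i for i, (x, y, z) in enumerate(zip(a, b, c)) if not (x == y == z)]


good = b"\x10\x20\x30\x40"
copy1 = b"\x10\x21\x30\x40"   # bit error in byte 1
copy2 = b"\x10\x20\x30\x00"   # bit error in byte 3
copy3 = good
print(disagreements(copy1, copy2, copy3))
print(majority_vote(copy1, copy2, copy3) == good)
```

The vote only works when the copies line up sample for sample; a copy with missing samples would have to be realigned first.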




Work to be done at this stage includes:

Goals:

The development steps for this project include:

  1. Set up a development environment and develop basic functions. This is a learning step covering reading a wav file and using the GUI. The FLTK library was chosen as it works on both Windows and Linux platforms. This part is done. The sample program generates a histogram of the sample values, which is useful for helping determine the sound quality of the wav file.

  2. Create some test wav files for spotting bit errors and missing samples. DONE.

  3. Develop an application to compare two wav files, looking for bit errors and missing samples.

  4. Develop an application to compare and repair based on three slightly corrupted wav files.

  5. Look into the possibility of comparing and repairing wav files generated by analog to digital conversion. Note that this is complex since considerations include: interpolating between samples, modulating the sample rate to account for slight variations in the source timebase (wow and flutter), and noise sources in general. This part of the project is slightly different from the previous parts, but the idea (recover sound information from different but similar sources) is the same.
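
The two-way compare in step 3 has to tell a bit error (one sample differs, then the files agree again) from missing samples (one file skips ahead). A minimal sketch of one possible approach follows: on a mismatch, look a few samples ahead in each file for a short run that lines up again. This is an assumed strategy, not Wava's implementation; compare_samples, find_gap, max_gap and window are names invented for the example, and both inputs are assumed to be plain Python lists of sample values.

```python
def find_gap(ref, test, i, j, max_gap, window):
    """Return +n if test skipped n samples, -n if ref did, or None."""
    for gap in range(1, max_gap + 1):
        ahead = test[j:j + window]
        if len(ahead) == window and ref[i + gap:i + gap + window] == ahead:
            return gap
        ahead = ref[i:i + window]
        if len(ahead) == window and test[j + gap:j + gap + window] == ahead:
            return -gap
    return None


def compare_samples(ref, test, max_gap=8, window=4):
    """List (kind, ref_index, test_index, count) events between two sample lists."""
    events = []
    i = j = 0
    while i < len(ref) and j < len(test):
        if ref[i] == test[j]:
            i += 1
            j += 1
            continue
        gap = find_gap(ref, test, i, j, max_gap, window)
        if gap is None:
            # No realignment found: treat it as a damaged sample and move on.
            events.append(("mismatch", i, j, 1))
            i += 1
            j += 1
        elif gap > 0:
            events.append(("missing_in_test", i, j, gap))
            i += gap
        else:
            events.append(("missing_in_ref", i, j, -gap))
            j += -gap
    return events


ref = list(range(100, 120))
print(compare_samples(ref, ref[:5] + ref[7:]))   # two samples dropped
```

Long runs of identical samples (digital silence, for example) can fool the look-ahead, so a real tool would need a longer window or a smarter match there. Trailing samples present in only one file are not reported by this sketch.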

Current Limitations:

  1. Restricted to 44ksample/second wav files with just one DATA section.

  2. 0.1.2 two way compare is limited to stereo wav files.
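
To check ahead of time whether a file falls inside these limits, the RIFF chunk list can be walked to read the sample rate and channel count and to count the data chunks. The sketch below assumes a standard little-endian RIFF/WAVE layout and assumes that "44ksample/second" means 44100; scan_wav and within_limits are hypothetical helper names, not part of Wava.

```python
import struct


def scan_wav(raw):
    """Return (channels, sample_rate, data_chunk_count) for RIFF/WAVE bytes."""
    if raw[:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    channels = sample_rate = None
    data_chunks = 0
    pos = 12
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            # fmt chunk starts with: format tag, channel count, sample rate.
            _tag, channels, sample_rate = struct.unpack_from("<HHI", raw, body)
        elif chunk_id == b"data":
            data_chunks += 1
        pos = body + size + (size & 1)   # chunks are padded to an even size
    return channels, sample_rate, data_chunks


def within_limits(raw, stereo_only=False):
    """Check the limits listed above (44100 is an assumed exact rate)."""
    channels, sample_rate, data_chunks = scan_wav(raw)
    if stereo_only and channels != 2:
        return False
    return sample_rate == 44100 and data_chunks == 1
```

Typical use would be to read the whole file with open(path, "rb").read() and pass the bytes to within_limits, with stereo_only=True for the 0.1.2 two-way compare.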

Inspiration for the project:

Bit errors on desktop PCs are rare, but can happen. My computer system developed this problem when the cooling fan for the bridge chip seized up. As it turned out, the system appeared to run fine for an unknown number of months. I did not suspect any problem until I tried to install a printer driver. The installation failed at different points over several attempts. Also, about once every two weeks, the Windows XP operating system crashed with a blue screen error. This was unusual compared to the previous year of fairly trouble-free computing (I had first attributed these problems to one of the patches Microsoft keeps sending out and expected some new patch to fix things).

After the driver install problem, I suspected a disk drive problem. I copied a large file (40 megabytes) then did a byte-by-byte compare in DOS. Something like two bytes did not match. After several repeats of this test, I noticed that: mismatches between the same two files were constant, the errors occurred in one of two bit positions, and new errors occurred only when creating a new file. Then I noticed the bridge chip cooling fan was frozen. Being a hardware person, I fixed the fan and repeated the test. At first I thought I had success, but eventually the failures started to occur again. The bridge chip was damaged internally (a speed fault or a logic voltage level not being quite achieved) and it corrupted data being written to the disk at a very low bit error rate. If I copied files from the disk and compared them on another system, there seemed to be no problem: this eliminated the PCI bus. Somewhere errors were occurring that could not be caught by parity or checksum tests. And the errors were infrequent enough to not fail most applications and web browsing.
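
The kind of test described above, comparing a copy against its original and noting which bit positions fail, can be sketched in a few lines. This is only an illustration of the idea, not the DOS procedure actually used; bit_error_positions is a name invented for the example.

```python
from collections import Counter


def bit_error_positions(original, copy):
    """Count differing bits by position (0 = least significant bit of a byte)."""
    counts = Counter()
    for a, b in zip(original, copy):
        diff = a ^ b          # set bits mark where the two bytes disagree
        bit = 0
        while diff:
            if diff & 1:
                counts[bit] += 1
            diff >>= 1
            bit += 1
    return counts


original = bytes(range(16))
damaged = bytearray(original)
damaged[3] ^= 0x04            # flip bit 2
damaged[9] ^= 0x40            # flip bit 6
damaged[12] ^= 0x04           # flip bit 2 again
print(bit_error_positions(original, bytes(damaged)))
```

Errors that cluster in one or two bit positions, as they did here, point toward a hardware fault rather than random corruption.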

A problem like this could cause a number of files to be corrupted in such small ways that they would never be noticed. This was the inspiration for this wav file compare and repair project.

An idea for another project would be an audit process where a background systems check might try to uncover these failings, but that's a bit beyond my abilities.

07/23/04