Comparing between two audio #23

Open

leonardltk opened this issue Apr 9, 2020 · 3 comments

@leonardltk

Hi,

According to the README, everything seems to be end-to-end when running the benchmark across the whole covers80 dataset.

Is there a way to simply compare two audio files using any of the algorithms and determine whether they are indeed cover versions of each other?

@furkanyesiler (Owner)

Hi @leonardltk!

The way almost all of these algorithms (including the ones that are not in the repo) work is that they estimate a similarity/dissimilarity score among a collection of songs. These similarity scores are sorted to compute a number of performance metrics common in Information Retrieval tasks, e.g. mean average precision, mean rank, and the number of relevant items in top-1. As a result, the absolute values of these similarity scores do not necessarily mean anything on their own. What we care about more is whether, when we give a query, the algorithm returns a relevant item (in our case, a cover) among the first retrieved results.
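
To make that concrete, here is a minimal, self-contained sketch (toy numbers, not the repo's actual evaluation code) of turning a pairwise distance matrix into the mean rank and top-1 count mentioned above:

    import numpy as np

    # Hypothetical 3-query, 3-item distance matrix (lower = more similar);
    # relevant[q] is the index of the true cover for query q.
    distances = np.array([[0.10, 0.80, 0.75],
                          [0.90, 0.20, 0.85],
                          [0.70, 0.95, 0.60]])
    relevant = [0, 1, 2]

    ranks = []
    for q, rel in enumerate(relevant):
        order = np.argsort(distances[q])  # ascending: best match first
        ranks.append(int(np.where(order == rel)[0][0]) + 1)  # 1-based rank

    print("mean rank:", np.mean(ranks))                 # 1.0 for this toy data
    print("relevant items in top-1:", sum(r == 1 for r in ranks))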

Depending on the algorithm you use, you can inspect the distance (or similarity score) distributions to set a threshold. For example, if the distances of covers lie between 0 and 0.4, and the distances of non-covers lie between 0.3 and 0.9, you can then set a threshold considering whether precision or recall is more important to you. Keep in mind that these distance distributions are likely to differ depending on the algorithm you use.
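
For instance, with a labelled validation set you could sweep candidate thresholds and watch the trade-off directly. A minimal sketch, assuming the hypothetical distance ranges above (lower distance = more similar):

    import numpy as np

    # Hypothetical labelled distances, matching the ranges in the example above
    cover_d = np.random.uniform(0.0, 0.4, 200)      # distances for true cover pairs
    noncover_d = np.random.uniform(0.3, 0.9, 2000)  # distances for non-cover pairs

    for t in np.arange(0.2, 0.65, 0.05):
        tp = np.sum(cover_d <= t)     # covers correctly accepted
        fp = np.sum(noncover_d <= t)  # non-covers wrongly accepted
        fn = np.sum(cover_d > t)      # covers wrongly rejected
        precision = tp / max(tp + fp, 1)
        recall = tp / max(tp + fn, 1)
        print(f"threshold={t:.2f}  precision={precision:.2f}  recall={recall:.2f}")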

I hope this answers your question. Please let me know if you have any further questions!

@leonardltk (Author)

Thanks for your reply!

In that case, how do you suggest I proceed with this problem?

Consider a database of 80 original songs and a collection of 1000 queries. It is unknown how many of these queries are covers of the 80, and there might be none at all.

An approach I'm considering right now is this:
Judging by how the acoss.algorithms.*.similarity() methods are implemented, they appear to be deterministic. So I could fix a threshold value of, let's say, 0.4, and at test time follow how the similarity is computed to decide whether the test file is a cover of the original.

For example, rqa_serra09.py has the following similarity measure:

    def similarity(self, idxs):
        for i,j in zip(idxs[:, 0], idxs[:, 1]):
            query = self.load_features(i)
            reference = self.load_features(j)
            # create instance of cover similarity algorithms from essentia
            crp_algo = ChromaCrossSimilarity(frameStackSize=self.m, 
                                            frameStackStride=self.tau, 
                                            binarizePercentile=self.kappa, 
                                            oti=self.oti)
            alignment_algo = CoverSongSimilarity(alignmentType='serra09', distanceType='symmetric')
            # compute similarity
            csm = crp_algo(query, reference)
            _, score = alignment_algo(csm)
            for key in self.Ds.keys():
                self.Ds[key][i][j] = score

Does this mean I can use a function like the following at test time?

    def test_ij(self, i, j, threshold):
        query = self.load_features(i)
        reference = self.load_features(j)

        # create instance of cover similarity algorithms from essentia
        crp_algo = ChromaCrossSimilarity(frameStackSize=self.m,
                                         frameStackStride=self.tau,
                                         binarizePercentile=self.kappa,
                                         oti=self.oti)
        alignment_algo = CoverSongSimilarity(alignmentType='serra09', distanceType='symmetric')

        # compute similarity and apply the fixed decision threshold:
        # scores at or above it are treated as covers
        csm = crp_algo(query, reference)
        _, score = alignment_algo(csm)

        return score >= threshold

One problem I foresee with using this is that the scores are unnormalised, as I did not take normalize_by_length into account.
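
If it helps, one plausible way to fold that in (an assumption on my part; the actual normalize_by_length in acoss may do something different) would be a small helper like this:

    def normalized_score(score, csm):
        # Assumed normalization: divide the raw alignment score by the
        # query-axis length of the cross-similarity matrix, so that longer
        # tracks do not get larger scores purely from their length. Check
        # acoss's normalize_by_length for its exact behaviour.
        return score / max(csm.shape[0], 1)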

@mayassin commented Oct 15, 2021

Hey @leonardltk! I am currently facing a problem similar to the one you mentioned. How were you able to solve it?
My problem is that, given an audio cover, I am trying to retrieve the best matching song for it if it exists and return null if it does not.
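
What I have in mind is something like this sketch (the function name and the higher-is-more-similar score convention are just assumptions on my part, following test_ij above):

    def best_match_or_none(scores, threshold):
        # scores: dict mapping database song id -> similarity score for one
        # query (higher = more similar, as in test_ij above).
        # Return the id of the best match, or None when even the best score
        # does not clear the threshold, i.e. no cover exists in the database.
        best_id = max(scores, key=scores.get)
        return best_id if scores[best_id] >= threshold else None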
