[mirtoolbox] Re: novelty question

  • From: Olivier Lartillot <olartillot@xxxxxxxxx>
  • To: mirtoolbox@xxxxxxxxxxxxx
  • Date: Wed, 10 Nov 2010 12:27:12 +0200

Hi Jens,

Yes, you are absolutely right, and I thank you for this interesting remark.

It seems I just found an answer to your question:

- Actually, until MIRtoolbox 1.2.3, the novelty curve was computed through a 
correlation of the Gaussian kernel *along the diagonal of the time/time 
similarity matrix*.

- In more recent versions, on the contrary, in order to speed up the 
computation significantly (without, in theory, introducing any error), the 
novelty curve is computed through a correlation of the kernel *along the 
zero-lag line of the time/lag similarity matrix*.

Of course, the kernel has to be different for each version: old versions use a 
time/time kernel, and new versions use a time/lag kernel.

It appears that these two versions actually give different results. In 
your ideal example, the old version corresponds to a correlation between a 
\/-shape kernel and a \/-shape matrix (we just need to take half of the matrix 
here, since it is symmetric), leading to a clear peak. In the new version, on 
the contrary, this corresponds to a correlation between a |/-shape kernel and 
a |/-shape matrix. Why "|/"? Because before the transition in your example, all 
frames are perfectly similar to their previous frames, whereas at the 
transition point, the new frame is suddenly completely dissimilar to all 
previous frames: this corresponds to the vertical bar (|) in this |/-shape. 
That is why the novelty curve in the newest version has a valley followed by a 
peak.
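
The |/ shape can be seen directly by re-indexing an idealized similarity 
matrix from time/time to time/lag. The following is only a sketch in plain 
MATLAB under the assumption that entry (lag, t) holds the similarity between 
frame t and frame t - lag; the names n, maxlag and TL are illustrative and the 
toolbox's internal layout may differ.

```matlab
% Idealized two-segment similarity matrix (1 within a segment, 0 across).
n   = 12;                                   % number of frames (hypothetical)
seg = [ones(1, n/2), 2*ones(1, n/2)];
S   = double(bsxfun(@eq, seg', seg));       % time/time similarity

% Time/lag re-indexing: row = lag (0..maxlag), column = time.
maxlag = 5;                                 % largest lag kept (hypothetical)
TL = nan(maxlag + 1, n);                    % NaN where frame t-lag does not exist
for t = 1:n
    for lag = 0 : min(maxlag, t - 1)
        TL(lag + 1, t) = S(t, t - lag);
    end
end

disp(flipud(TL))                            % displayed with lag increasing upward
```

At the first frame of the second segment (column n/2 + 1) every lag above zero 
drops to 0, which is the vertical bar; in the following columns the zeros 
retreat by one lag per frame, which is the slanted edge.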

What I propose is to make both versions, the diagonal one and the time/lag 
one, available in the next version of MIRtoolbox, which I am preparing now. I 
don't know whether the difference between the versions affects the resulting 
novelty peak detection. Maybe the time/lag version could be the default 
option, because it is much faster to compute?

Regards,

Olivier

Jens Hjortkjær wrote on 3.11.2010 at 23.05:

> Hi Olivier,
> Thanks for your reply. I understand your point about the small kernel shape, 
> but I have the problem with a kernel of any size (sorry if the example was 
> badly chosen). For instance, using the default kernel size in the toolbox: 
> fs=1000;
> t=0:1/fs:20.999;
> L=length(t);
> y=[sin(2*pi*220*t(1:L/2)) sin(2*pi*319*t(L/2:L))];
> a=miraudio(y,fs);
> mirnovelty(a)
> This also looks strange, or at least different from the earlier version of the 
> toolbox that I used (1.2.3). Has something changed in the implementation? 
> 
> Cheers,
> Jens
> 
> <novelty_1_2_3.jpg>
> <novelty_1_3_1.jpg>
> On 03/11/2010, at 14.01, Olivier Lartillot wrote:
> 
>> Hi Jens,
>> 
>> Thanks for the interesting question, and sorry for the late reply.
>> 
>> The strange behavior you observed is due to the fact that you use a very 
>> small kernel. Since it is a Gaussian checkerboard kernel (Gaussian meaning 
>> here that this matrix is windowed by a 2D Gaussian), for a 5-sample wide 
>> matrix, that kernel does not look like a checkerboard anymore, so the 2D 
>> convolution operation performed in mirnovelty becomes very approximate.
>> 
>> With a bigger kernel, you begin to see a peak after the node you mentioned. 
>> That particular node comes from the fact that just before the kernel is 
>> perfectly aligned to the transition (leading to the expected peak), there is 
>> an exaggerated mismatch between the expectation given by the kernel and what 
>> actually happens just before the transition.
>> 
>> Best,
>> 
>> Olivier
>> 
>> Jens Hjortkjær wrote on 21.10.2010 at 9.35:
>> 
>>> Hello all,
>>> 
>>> I just updated to MIRtoolbox 1.3.1 and I'm a little confused about the 
>>> output of the mirnovelty function. I seem to be getting inverted novelty 
>>> scores: high when novelty is low and vice versa.
>>> 
>>> E.g.:
>>> 
>>> fs=1000;
>>> t=0:1/fs:5.999;
>>> L=length(t);
>>> y=[sin(2*pi*120*t(1:L/2)) sin(2*pi*319*t(L/2:L))];
>>> a=miraudio(y,fs);
>>> mirnovelty(a,'kernelsize',5)
>>> 
>>> Here I have 0 at the change, rather than 1. I don't think I had this 
>>> problem in earlier versions. Does anyone here have an explanation for this?
>>> 
>>> Cheers,
>>> Jens
>> 
> 
