Jean-Louis Durrieu, PhD: Research Page

Short Bio

I was born on August 14th, 1982, in Saint-Denis, Reunion Island, France. I received the State Engineering degree and the Ph.D. degree, in the field of audio signal processing, from Télécom ParisTech (formerly known as ENST or Télécom Paris), Paris, France, in 2006 and in 2010, respectively. I am currently a (post-doctoral) research scientist in the Signal Processing Laboratory 5 (LTS5) at the EPFL, Lausanne, Switzerland.

My main research interests are statistical models for audio signals, with applications to language learning technologies, musical audio source separation and music information retrieval.

Research Interests

Methodology
- Statistical Signal Processing:
  - Bayesian inference
  - Stochastic models
  - Algorithms: Expectation-Maximisation (EM), Variational Approximation
- Signal production models:
  - Source/Filter models for speech and music instruments
  - Gaussian Mixture Models (GMM), Hidden Markov Models (HMM)
  - Non-negative Matrix Factorisation (NMF)

Applications
- (Music) Audio Signal Processing: analysis, transcription, separation, visualisation
- Speech processing: language identification, representations, enhancement

I enjoy investigating mathematical and statistical models for audio, music and speech processing. More specifically, I have worked on source/filter models for the voice and various instruments, combined with techniques that decompose the signal onto a basis of elementary components, notably Nonnegative Matrix Factorization (NMF). This research was particularly successful at two tasks: audio melody extraction and separation of the lead instrument from the background music.
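
As an illustration only, the following minimal sketch shows the basic idea of NMF: a non-negative matrix V (for instance a power spectrogram) is approximated by the product of a basis matrix W and an activation matrix H. It uses the standard multiplicative updates for the Euclidean cost, which is a simplification: it is neither the source/filter model nor the Itakura-Saito variant studied in the publications listed on this page. The function name `nmf`, the toy data and all parameter values are assumptions made for this example.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative matrix V by W @ H (Euclidean cost).

    Hypothetical helper for illustration: standard multiplicative
    updates, not the models used in the publications.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialisation of basis and activations.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Frobenius norm of the residual after this iteration.
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors


# Toy "spectrogram": 64 frequency bins, 100 frames, built from 3 components.
rng = np.random.default_rng(1)
V = rng.random((64, 3)) @ rng.random((3, 100))

W, H, errors = nmf(V, n_components=3)
print("residual after first iteration:", errors[0])
print("residual after last iteration:", errors[-1])
```

In an audio setting, the columns of W would play the role of elementary spectral shapes and the rows of H their activations over time.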

I also worked on language learning technologies and "computer-aided pronunciation training" (CAPT) techniques, in collaboration with SpeedLingua. I have also collaborated with the Montreux Jazz Digital Project, at the EPFL. We have further developed audio source separation techniques to produce multiple sound tracks, allowing various remixing applications (Karaoke, spatialisation, ...), as well as Graphical User Interfaces (GUI) to demonstrate the results at the Montreux Jazz Festival.

Publications, Scientific dissemination

JOURNAL PAPERS:

2013:

J.-L. Durrieu and J.-Ph. Thiran, Source/Filter Factorial Hidden Markov Model, with Application to Pitch and Formant Tracking, IEEE Transactions on Audio, Speech and Language Processing, August 2013.

2011:

J.-L. Durrieu, B. David and G. Richard, A Musically Motivated Mid-Level Representation For Pitch Estimation And Musical Audio Source Separation, IEEE Journal of Selected Topics in Signal Processing, October 2011, Vol. 5 (6), pp. 1180-1191. (First submission: September 2010). [pdf] [link on IEEExplore] [web] [preprint] [copyright]

2010:

J.-L. Durrieu, G. Richard, B. David and C. Févotte, Source/Filter Model for Main Melody Extraction From Polyphonic Audio Signals, IEEE Transactions on Audio, Speech and Language Processing, special issue on Signal Models and Representations of Musical and Environmental Sounds, March 2010, Vol. 18 (3), pp. 564-575. [pdf] [link on IEEExplore]
This research was partly funded by the OSEO project Quaero and partly funded by the European K-Space project.

2009:

C. Févotte, N. Bertin and J.-L. Durrieu, Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis, Neural Computation, March 2009, Vol. 21, No. 3: 793 - 830. [audio samples] [bibtex]
This research was partly funded by the European K-Space project.

PEER-REVIEWED INTERNATIONAL CONFERENCES:

2013
- A. Liutkus, J.-L. Durrieu, L. Daudet and G. Richard, An Overview of Informed Source Separation, the International Workshop on Image and Audio Analysis for Multimedia Interactive Services, July 3-5, 2013, Télécom ParisTech, Paris, France.

2012

J.-L. Durrieu and J.-P. Thiran, Musical Audio Source Separation Based on User-Selected F0 Track, the International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA), March 12-15, 2012, Tel-Aviv, Israel. [web]
J.-L. Durrieu, F. Kelly and J.-P. Thiran, Lower and upper bounds for approximation of the Kullback-Leibler divergence between two Gaussian mixture models, IEEE ICASSP, March 25-30 2012, Kyoto, Japan.

2011

J.-L. Durrieu and J.-P. Thiran, Sparse Non-Negative Decomposition Of Speech Power Spectra For Formant Tracking, IEEE ICASSP, May 22-27 2011, Prague, Czech Republic.
A. Ozerov, C. Févotte, R. Blouet and J.-L. Durrieu, Multichannel nonnegative tensor factorization with structured constraints for user-guided audio source separation, IEEE ICASSP, May 22-27 2011, Prague, Czech Republic.
F. Weninger, J.-L. Durrieu, F. Eyben, G. Richard, B. Schuller, Combining Monaural Source Separation With Long Short-Term Memory for Increased Robustness in Vocalist Gender Recognition, IEEE ICASSP, May 22-27 2011, Prague, Czech Republic.

2010

R. Foucard, J.-L. Durrieu, M. Lagrange and G. Richard, Multimodal similarity between musical streams for cover version detection, ICASSP, 14 - 19 March 2010, Dallas, Texas, USA.

2009

J. Weil, T. Sikora, J.-L. Durrieu and G. Richard, Automatic generation of lead sheets from polyphonic music signals, ISMIR, 26-30 October 2009, Kobe, Japan. [pdf] [poster (1.1 MB)] [temporary lead-sheet examples] [lead-sheet examples at TUB]
This research is partly funded by the OSEO project Quaero and partly funded by the European K-Space project.

J.-L. Durrieu, A. Ozerov, C. Févotte, G. Richard and B. David, Main Instrument Separation From Stereophonic Audio Signals Using A Source/Filter Model, EUSIPCO, 24-28 August 2009, Glasgow, Scotland. [pdf] [presentation (4.5 MB)] [audio examples]
This research is partly funded by the OSEO project Quaero and partly funded by the French ANR project SARAH (StAndardisation du Remastering Audio Haute-definition).
J.-L. Durrieu, G. Richard and B. David, An Iterative Approach to Monaural Musical Mixture De-Soloing, ICASSP, April 19-24 2009, Taipei, Taiwan. [pdf] [poster] [audio examples] [copyright]
This research is partly funded by the European K-Space project and by the OSEO project Quaero.

2008

J.-L. Durrieu, G. Richard and B. David, Singer melody extraction in polyphonic signals using source separation methods, ICASSP 2008. [pdf] [poster] [audio examples] [copyright]
This research is funded by the European K-Space project.

SEMINARS

J.-L. Durrieu, Automatic Extraction of the Main Melody from Polyphonic Music Signals. With Application to Transcription and Separation, seminar at the EPFL, Lausanne, Switzerland, 4th December 2009. [presentation (4.7 MB)]

J.-L. Durrieu, Automatic Separation and Transcription of the Main Melody from Polyphonic Music Signals, seminar at IRCAM, Paris, 30th November 2009. [presentation (4.8 MB)]
J.-L. Durrieu, Automatic Transcription and Separation of the Main Melody from Polyphonic Music Signals, seminar at METISS group, IRISA, Rennes, 16th April 2009. [presentation (13.3 MB)]

THESES

PhD thesis: Automatic Transcription and Separation of the Main Melody in Polyphonic Music Signals, defended on Friday May 7th 2010, 2pm, at Télécom ParisTech. [pdf] [web]
Master's thesis: A Query_By_Humming System, July 2006, research internship at the "FIT" laboratory, under Professor Xu MingXing's supervision. [pdf]

EVALUATION CAMPAIGNS:
- 2011:
  - SiSEC, Professionally Produced Music Recordings. Best SDR on most of the individual extracted vocal tracks submitted. [web] [results (dev set) (test set)]
- 2009:
  - Music Information Retrieval Evaluation eXchange (MIREX), Audio Melody Extraction (AME). Best overall accuracy on MIREX08 dataset, 2nd best global overall accuracy. [web] [results]
- 2008:
  - Music Information Retrieval Evaluation eXchange (MIREX), Audio Melody Extraction (AME). Best overall accuracy on MIREX08 dataset, 2nd best global overall accuracy. [web][results]
  - SiSEC, Professionally Produced Music Recordings. [web][results]

ACTIVITIES AS A SCIENTIFIC PEER:
- Reviewer for international conferences and international journals: IEEE Signal Processing Letters (SPL), IEEE Journal of Selected Topics in Signal Processing (JSTSP), IEEE Transactions on Audio, Speech and Language Processing (TASLP), International Conference of the International Society for Music Information Retrieval (ISMIR), Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).

MISC.

2009:

J. Weil, T. Sikora, J.-L. Durrieu and G. Richard, Beat Tracking Using The Delta-Phase Matrix, research report, Télécom ParisTech, 2009. [tech-rep]

2007:

J.-L. Durrieu, G. Richard and B. David, Single Sensor Singer/Music Separation Using A Source/Filter Model Of The Singer Voice, ACOUSTICS'08. [abstract] [poster] [audio examples]

SSMS2007, summer school:

the abstract for my research interests

my presentation poster

Software

pyFASST: Flexible Audio Source Separation Toolbox, Python implementation of the original Matlab version. [html doc] [PyPI] [GitHub]
Main instrument source separation program, Python/NumPy/SciPy. See this companion site for our JSTSP'2011 article.
Fundamental frequency salience visualization, Vamp Plugin. [GitHub page]
User-guided source separation, Python/NumPy/SciPy/PyQt4 or PySide. See the SiSEC 2011 - LVA/ICA 2012 companion website.
Some pointers to research and GUI programs from my previous webpage.

Links

Friends, colleagues and co-authors:
- Those with permanent positions (as of 12/01/2012):
- And the others (whom you may find using your favorite Internet search engine): Simon Arberet, Romain Hennequin, Antoine Liutkus, Thomas Maugey, Laurent Oudre, Alexey Ozerov, Jan Weil, ...
- Myself: Facebook, Google Plus, LinkedIn, Twitter, signal processing (DSP) on stackexchange.com, GitHub

Personal

I play the oboe and the saxophone, and I like playing in chamber music ensembles (woodwind quintets, wind octets, sax quartets...) or alone. I am also keen on table tennis and martial arts (Taiji, Nunchaku).

Copyright information for the publications

Copyright 2008 IEEE. Published in the IEEE 2008 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), scheduled for March 30 - April 4, 2008 in Las Vegas, Nevada, U.S.A. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.

Copyright 2009 IEEE. Published in the IEEE 2009 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), scheduled for April 19 - 24, 2009 in Taipei, Taiwan. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.

© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.