JV Cossu


My name is Jean-Valère Cossu and I come from Corsica. I am currently looking for opportunities to apply my research expertise.
I just left my Research Engineer position at Vodkaster, where I mainly worked on recommender systems within the ALICIA project, funded by the ANR (French National Research Agency). I was also interested in data science, networking (streaming) and multimedia encoding.
I obtained my Ph.D. in Computer Science, on E-Reputation Monitoring, at the Computer Science Laboratory (LIA) of Avignon University (France) under the supervision of Professor Marc El-Bèze, Dr. Juan-Manuel Torres-Moreno and Dr. Eric SanJuan.
My research interests lie in online reputation management (ORM) for companies and in politics, drawing on the fields of Natural Language Processing (NLP), Information Retrieval (IR) and Social Network Analysis (SNA). I am also interested in digital humanities and links between data and ethics.
My work mainly focused on Machine Learning (ML), at the border between classification and clustering, using documents shared on the Web 2.0 (blogs and Twitter) and their associated metadata.

Before I joined the NLP and Speech team of the LIA, I received my master's degree in Computer Science, specializing in networks, at the University of Avignon (2010 - 2012). During my internship at the lab I worked on opinion mining in recommender systems with Professor Marc El-Bèze. Based on a "cinema" social network (Vodkaster), the goal was to analyse the likes and dislikes of users in order to predict their opinion on upcoming movies. I also have a background in network management.

I gave various courses in computer science at the University of Avignon: Networking, C/C++ and Java (Networking) programming, Internet Engineering. I was associate supervisor for a master students' project on deep neural networks in reputation analysis. I was also associate supervisor for a master student internship on Partial Least Squares Path Modelling (PLS-PM) in football betting.


Research Topics

  • Natural Language Processing
  • Machine Learning
  • Information Retrieval


Current work (Vodkaster/ALICIA)

At Vodkaster I am currently working mainly on movie recommender systems within the ALICIA (ANR-13-CORD-0020) project. My work consists in studying features to improve recommendations. It also includes modelling and analysing the dynamics of movie reputation on the website. My research area also includes network broadcasting and video coding. I am also interested in modelling and improving customer satisfaction through network analysis.


Ph.D. Thesis

Subject: Analysing entities web representation over Web 2.0
Ph.D. Supervisors: Marc El-Bèze, Juan-Manuel Torres-Moreno and Eric SanJuan
My main work is about identifying how the reputation of entities is represented. It has two major aims: analysis and visualisation.
The web and political contexts add difficulties to the analysis side. On the web there are many biases; the first could be that the web gives more coverage to negative messages. Politics is constantly shifting: people can easily change their opinion, and each new fact becomes an emerging concept. When it comes to modelling reputation, the context adds other difficulties, such as the point of view and how to estimate the right (temporal) window. The project includes different parts, from manual annotation with domain experts to machine learning with automatic annotation and clustering.


Imagiweb Project

This project is closely related to the domain of Sentiment Analysis, and more specifically to Opinion Mining. The main idea is to detect “what people think about a given entity” from document content. This idea is usually implemented as a more realistic task: classifying the opinion expressed about something into a set of predefined polarities (e.g., positive vs. negative or neutral). The nature of the vocabulary used in tweets limits the use of existing sentiment lexicons. This project also aims to study which aspect of the entity the opinion is about. Although this could be seen as Topic Detection, it is harder, since the topic set was defined by experts at a concept level and the topics may never appear explicitly in the content.
Politics has already been addressed in previous work, but mostly in English and about US politics, and rarely with the precision expected in this project. Furthermore, a dataset built with the involvement of specialists in political science will be provided to the community.

Within the Imagiweb project my work consists in modelling and analysing the dynamics of web reputation. Our objective is to help political science researchers and the communication team of the main French electric utility company (EDF) by providing them with tools for automatic topic and opinion annotation of documents dealing with their reputation (tweets about politicians and about the “EDF” entity). The project covers several issues such as Active Learning and Natural Language Processing in Big Data Mining. My research activities cover document clustering and categorization using several Machine Learning methods. I also study the impact of external features on retweet behaviour.


Evaluation Campaigns

RepLab'2014
RepLab 2014 provides an extension to the RepLab 2013 tasks. It put the emphasis on a categorization task (using the reputation standards of the RepTrak framework) and on the characterization of Twitter profiles (Author Profiling), as a complement to the CLEF PAN challenge. We tried to tackle both problems with statistical NLP-based classifiers, following a philosophy of simplicity and re-usability, with matching as the core process.
We obtained competitive results in the Author Profiling subtasks.
The Reputation Dimensions classification task resembles the topic categorization aspect of the Imagiweb project, which is why we ran further experiments.
My work consisted in team management and participation in all subtasks.

RepLab'2013
RepLab 2013 is an evaluation challenge focusing on the problem of monitoring the reputation of entities on Twitter. It consists of several tasks: entity name disambiguation (Is the tweet about the entity?), reputation polarity detection (Does the tweet have positive or negative implications for the entity’s reputation?), topic detection (What issue relative to the entity is discussed in the tweet?) and topic ranking (Is the topic a reputation alert that deserves immediate attention?). The provided dataset contained tweets in two languages: English and Spanish. We mainly investigated how well Speech Recognition and Information Retrieval systems can address the issues of a reputation management context (filtering and polarity, tasks 1 and 2) and how simple NLP-based classifiers perform on the ranking and clustering tasks (tasks 3 and 4). We obtained competitive results in each subtask.
My work consisted in team management, merging all systems and, of course, participation in several subtasks with my own ideas.

Deft’2013
The DEFT 2013 edition addressed a new application domain, cooking recipes, a theme already studied in a past evaluation campaign (the Computer Cooking Contest). We focused on two analysis functions in DEFT 2013, document classification (tasks 1 to 3) and information extraction (task 4), in a specialised domain.
My participation consisted in acting as an expert annotator, evaluating subsets of the systems' submissions as if we were in an active learning process.


Master Thesis

Subject: Opinion mining in a movie recommender system

Master Supervisor: Marc El-Bèze

This Master thesis studies an opinion mining system for a movie social network. The method relies on Natural Language Processing applied to user reviews. Several aspects were covered: from the user's point of view, what he or she likes or dislikes in movies, and from the movie side, what the main target of the movie is. This analysis is used to offer each user a reasoned movie recommendation.


International Conferences

Intweetive Text Summarization

Cossu J-V., Torres-Moreno, J. M, San-Juan E. and El-Bèze M.

14th Mexican International Conference on Artificial Intelligence (MICAI), Mexico (Mexico) October 25-31 2015

Multi-Dimensional Reputation Modeling using Micro Blog contents

Cossu J-V., San-Juan E., Torres-Moreno, J. M and El-Bèze M.

22nd International Symposium on Methodologies for Intelligent Systems, Lyon (France) October 21-23 2015

Detecting Real-World Influence Through Twitter

Cossu J-V., Dugue N. and Labatut V.

The Second European Network Intelligence Conference, Karlskrona (Sweden) September 21-22 2015

NLP-based classifiers to generalize experts assessments in E-Reputation

Cossu J-V., Ferreira E., Gaillard J., Janod K. and El-Bèze M.

Sixth International Conference of the CLEF initiative, Toulouse (France) September 8-11 2015

Automatic Classification and PLS-PM Modeling for Profiling Reputation of Corporate Entities on Twitter

Cossu J-V., San-Juan E., Torres-Moreno, J. M and El-Bèze M.

20th International Conference on Application of Natural Language to Information Systems (NLDB 2015), Passau (Germany) June 17-19 2015

An opinion mining Partial Least Square Path Modeling for football betting

El Hamdaoui M. and Cossu J-V.

PhD Session of the 7th European Conference on Machine Learning and Practice of Knowledge Discovery in Databases, Nancy (France) September 15-19 2014

Towards the improvement of topic priority assignment using various topic detection methods for e-reputation monitoring on Twitter

Cossu J-V., Bigot B., Bonnefoy L. and Senay G.

19th International Conference on Application of Natural Language to Information Systems (NLDB 2014), Montpellier (France) June 18-20 2014

Journal

A review of features for the discrimination of twitter users: application to the prediction of offline influence

Cossu J-V., Labatut V. and Dugue N.

Social Network Analysis and Mining

Special Issue on Diffusion of Information and Influence in Social Networks (2016), 10.1007/s13278-016-0329-x

Bilingual and Cross Domain Politics Analysis

Cossu J.-V, Abascal R., Molina A., Torres-Moreno, J. M. and SanJuan, E.

Research in Computing Science (ISSN 1870-4069)

Issue 85 (2014), pages 9–19

International Workshop

Machine Learned Annotation of tweets about politicians' reputation during Presidential Elections: the cases of Mexico and France

Cossu J.-V, Abascal R., Molina A., Torres-Moreno, J. M. and SanJuan, E.

Bilingual and Cross Domain Politics Analysis

Cossu J.-V, Abascal R., Molina A., Torres-Moreno, J. M. and SanJuan, E.

Avances en la Ingeniería del Lenguaje y del Conocimiento

2nd International Symposium on Language & Knowledge Engineering, Puebla (Mexico) 4-5 December 2014

National Conferences

Etude de l'image de marque d'entités dans le cadre d'une plateforme de veille sur le Web social

Khouas L., Brun C., Peradotto A., Cossu J-V., Boyadjian J. and Velcin J.

22ème Conférence sur le Traitement Automatique des Langues Naturelles, (DEFT/TALN 2015), Caen (France) June 22-25 2015

Recherche et utilisation d'entités nommées conceptuelles dans une tâche de catégorisation

Cossu J-V., Torres-Moreno J-M. and El-Bèze M.

20ème Conférence sur le Traitement Automatique des Langues Naturelles, (DEFT/TALN 2013), Sables d’Olonne (France) June 17-21 2013

Challenges

LIA@RepLab 2014: 10 systems for 3 tasks

Cossu J.-V., Janod K., Ferreira E., Gaillard J. and El-Bèze M.

RepLab: An evaluation campaign for Online Reputation Management Systems

Fifth International Conference of the CLEF initiative, Sheffield (UK) 15-18 September 2014

LIA@RepLab 2013

Cossu J.-V., Bigot B., Bonnefoy L., Morchid M., Bost X., Senay G., Dufour R., Bouvier V., Torres-Moreno J.-M. and El-Bèze M.

RepLab: An evaluation campaign for Online Reputation Management Systems

Fourth International Conference of the CLEF initiative, Valencia (Spain) September 23-26 2013

Systèmes du LIA à DEFT'13

Bost X., Brunetti I., Cabrera-Diego L-A., Cossu J-V., Linhares A., Morchid M., Torres-Moreno J-M., El-Bèze M. and Dufour R.

Défi Fouille de Texte (DEFT/TALN 2013), Sables d’Olonne (France) June 17-21 2013

National Workshop

Contextualisation de messages courts: l’importance des métadonnées

Cossu J-V., Gaillard J., Torres-Moreno J-M. and El-Bèze M.

Conférence Francophone sur l'Extraction et la Gestion des Connaissances (EGC 2013), Toulouse (France) January 28 2013

Others

Analyser l'image de marque d'entités sur le web. Revue du projet ImagiWeb.

Velcin J., Peradotto A., Khouas L., Cossu J-V., Dormagen J-Y. and Brun C.

Ingénierie des Systèmes d'Information 19(3): 159-162 (2014)



194 teaching hours, including practical work and tutorials (detailed information):

Year 2014/2015 Teaching

Tutorials & Practical Work                                              Hours
C++ Base Programming (B.Sc.)                                            17
Advanced C++ Programming (B.Sc.)                                        25.5
NLP Database integration and graph analysis in Social Networks (M.Sc)   21
Internship Supervisor (M.Sc)                                            -
Project Supervisor (M.Sc): Hypervisor vs Docker                         -
Total                                                                   63.5


Year 2013/2014 Teaching

Tutorials & Practical Work                                              Hours
C++ Base Programming (B.Sc.)                                            24
Advanced C++ Programming (B.Sc.)                                        25.5
Network infrastructure & Social Networks (M.Sc)                         21
Internship Supervisor (M.Sc): PLS-PM NLP in football betting            -
Total                                                                   70.5


Year 2012/2013 Teaching

Tutorials & Practical Work                                              Hours
C++ Base Programming (B.Sc.)                                            20
Advanced C++ Programming (B.Sc.)                                        24
Network Programming (M.Sc)                                              21
Project Supervisor (M.Sc): Deep Learning NLP                            -
Total                                                                   65


I used to collaborate with the University of Avignon within the ANR project Imagiweb, on the analysis of the reputation of entities (individuals and companies) over Web 2.0.


Research Keywords

Natural Language Processing
Information Retrieval
Online Reputation Management and Monitoring
Machine Learning
Social Media Analysis
Content Ranking and Selection (Summarization)
User-generated Content Mining and Categorization
User Profiling (Influence, SCC, Age, Gender, Personality, Political Orientation)
Item Modeling from reviews
Recommender systems (movies or other cultural products)
Artificial Intelligence


Education

09-2012 -- 08-2015 Ph.D. in Computer Science, specialized in Natural Language Processing applied to Online Reputation Analysis, LIA - University of Avignon (France).

09-2010 -- 08-2012 Master of Science, specialized in Networking and Natural Language Processing,
CERI - University of Avignon (France).


Employment Experience

10-2015 -- 10-2016 Research Engineer at Vodkaster - Paris (France).

09-2012 -- 08-2015 Lecturer at CERI - University of Avignon (France).
Various courses in Computer Science: Networking, C/C++ and Java (Networking) programming,
Introduction to Social Network Analysis.

11-2011 -- 08-2012 Junior Research Assistant at LIA - University of Avignon (France).


Languages

French (Native), English, notions of Italian and Spanish.


Teaching

C++ Base Programming
C/C++ and Java (Networking)
Network and Telecommunications
Social Network Analysis


References

Philippe Fillinger
i-Roe
Phone: +33 6 16 55 68 15
Email: philippe.fillinger@i-roe.com

 


Chris Navas
Vodkaster / RIPLAY SAS
23 Rue Boyer 75020 Paris, France
Phone: +33 6 20 54 43 14
Email: chris@vodkaster.com

 


Professor Marc El-Bèze
University of Avignon
339 chemin des Meinajariès
84911 Avignon, France
Phone: +33 490 843 508
Email: marc.elbeze@univ-avignon.fr

Juan-Manuel Torres-Moreno
Associate Professor (HDR)
University of Avignon
339 chemin des Meinajariès
84911 Avignon, France
Phone: +33 490 843 568
Email: juan-manuel.torres@univ-avignon.fr

Eric SanJuan
Associate Professor
University of Avignon
339 chemin des Meinajariès
84911 Avignon, France
Phone: +33 490 843 568
Email: eric.sanjuan@univ-avignon.fr



37 Rue Saint Sébastien
75011 Paris
FRANCE


jvcossu@gmail.com


+33 665 630 728


Twitter @jvcossu
Google Scholar
DBLP
LinkedIn
Viadeo