
Computational appraisal of gender representativeness in popular movies

This article by Antoine Mazieres, Telmo Menezes and Camille Roth has just been published in Humanities and Social Sciences Communications. It explores the possibility of using artificial intelligence and machine learning algorithms to assess gender representativeness in popular movies. It focuses on a very simple task: counting the faces of women and men appearing in more than 3500 popular movies spanning over three decades. On average, over the whole dataset, only 34.52% of the faces displayed in a movie are detected as female. We also observed a significant increase in the share of women's faces over time: from 1985 to 1998, this ratio is 27%, and it comes closer to female-male parity in the most recent period, from 2014 to 2019, with a ratio of 44.9%. The diversity of situations (formally, the variance of this ratio) increases as well, which means that recently produced films tend to cover a more diverse range of on-screen women-men shares.
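As a rough illustration of the two quantities mentioned above (the average share of female faces and its variance across movies), here is a minimal sketch. The face counts, the `movies` list, and the helper names `female_ratio` and `summarize` are hypothetical and made up for this example; they are not the study's data or code, and the face detection step itself is assumed to have already produced the counts.

```python
from statistics import mean, pvariance

# Hypothetical per-movie face counts: (release year, female faces, male faces).
# These numbers are invented for illustration and are not taken from the study.
movies = [
    (1987, 120, 380),
    (1992, 150, 350),
    (1996, 90, 210),
    (2015, 260, 240),
    (2017, 100, 300),
    (2019, 330, 270),
]

# The two periods named in the post (earliest and most recent).
periods = {"1985-1998": (1985, 1998), "2014-2019": (2014, 2019)}


def female_ratio(female, male):
    """Share of detected faces classified as female in one movie."""
    return female / (female + male)


def summarize(movies, periods):
    """Return {period: (mean ratio, variance of the ratio)} across movies."""
    summary = {}
    for label, (start, end) in periods.items():
        ratios = [
            female_ratio(f, m) for year, f, m in movies if start <= year <= end
        ]
        summary[label] = (mean(ratios), pvariance(ratios))
    return summary


stats = summarize(movies, periods)
for label, (avg, var) in stats.items():
    print(f"{label}: mean ratio = {avg:.3f}, variance = {var:.4f}")
```

In this toy dataset, the later period has both a higher mean ratio and a higher variance, which is the kind of pattern the post describes; the actual figures come from the article itself.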

The open-access article may be found here, accompanied by a popularized version in several languages.

Welcome to Govind Gandhi

Govind Gandhi just joined the team and will be working as an intern under the supervision of Camille Roth. With a background in physics, network theory and AI, Govind will focus his work on extending a framework to describe the evolution of socio-semantic networks over time, using local rules.

Govind Gandhi

Welcome to Titouan Morvan

Titouan is a mathematics student specializing in statistics and machine learning. As a research engineer, he joins the team for six months and will work under the supervision of Camille Roth within the socsemics ERC project on semantic hypergraphs with applications to debates on climate and energy policies.

Titouan Morvan

Welcome to Manuel Tonneau

Manuel Tonneau just joined the team for six months and will be working as a research engineer within the socsemics ERC project under the supervision of Camille Roth. With a background in economics and statistics, Manuel acquired a machine learning skillset in startup research teams (James, Creatext) and applied these techniques in a social science context at international research institutions (OECD, World Bank). At CMB, Manuel will combine stance detection methods and network analysis in an empirical study of echo chambers on Twitter.

Manuel Tonneau

One 3-year doctoral fellowship in computer science or related field

ERC Consolidator “Socsemics”, modeling of socio-semantic systems

In addition to one three-year post-doctoral fellowship, the team is now opening one new three-year doctoral fellowship in computer science. The fellowship focuses either on the development of graph-theoretic and dynamic models of the emergence and stability of socio-semantic clusters in online communities, or on breakthroughs in automated content analysis that go beyond classical distributional approaches to render the linguistic complexity of utterances in web corpora. Further information may be found in this document.


Centre Marc Bloch: open post-doctoral position! D/L February 28, 2021

Centre Marc Bloch e.V. is opening a position for a postdoctoral researcher who focuses on the economic and sociological impacts of ICT-related industries. The contract would ideally start between April 1st and October 1st, 2021, and may cover a maximum period of three years.

Individual research topics could cover the whole field, but a project in one of the four following research directions would be particularly welcome:

  • ICT-mediated job markets
  • Local markets
  • Work automation and algorithmic management
  • Emergence of consumers-producers

You may find the detailed call for applications here.

Welcome to Romain Avouac

Romain Avouac just joined the team for a year under the supervision of Camille Roth and within the socsemics ERC project as a research assistant. He will focus on developing NLP approaches that go beyond classical distributional approaches in order to better assess the semantic similarity between actors’ online positions. Romain works as a trainee public statistician at the French statistical institute (INSEE).

Romain Avouac

Welcome to Dougal Shakespeare

Dougal Shakespeare just joined the team for three years under the supervision of Camille Roth and within the ANR RECORDS project to pursue a PhD in computational social science. He will research the role of algorithmic guidance on music streaming platforms. His work focuses on exploring the differences between algorithmic and organic music consumption behaviours, tracing the degree to which commonly deployed filtering algorithms may expand or, rather, constrain the diversity of one’s music preferences.

Dougal Shakespeare

Welcome to Jonathan St-Onge

A former philosopher of science and cognitive scientist, Jonathan St-Onge just joined the team for three years under the supervision of Camille Roth and within the socsemics ERC project to pursue a PhD in computational social science. He mixes and matches probabilistic network models with different semantic representations to better understand how the nested hierarchies of both social and semantic structures come together in digital niches as socio-semantic bubbles. At a metalevel, he is greatly interested in how and why scientists study models of things rather than the things themselves.

Jonathan St-Onge

Team presentations at Sunbelt 2020

The Sunbelt virtual conference, INSNA’s flagship international conference on social network analysis, took place online on July 13-17, 2020. The team presented two communications and one poster.

Telmo Menezes introduced his work with Camille Roth on a natural language representation model called “semantic hypergraphs”, which enables the extraction of information from free text, including for instance the identification of claims, conflicts, and beliefs of actors. Nikita Basov and Camille Roth presented their tribute to John Mohr based on their article published in Poetics as “The Socio-Semantic Space of John Mohr”, addressing the visualization of sizable hybrid socio-semantic networks of co-authors and concepts surrounding a given scholar. Finally, Jonas Stein, Jérémie Poiroux and Camille Roth presented a poster largely based on Jonas’s master’s internship work about the intersection of users’ structural and semantic confinement on Twitter.

Welcome to Katrin Herms

Katrin Herms just joined the team for three and a half years under the supervision of Camille Roth and within the socsemics ERC project to pursue an interdisciplinary PhD project in sociology linking social network analysis of polarized internet communities with face-to-face interviews. Building on her practical background in journalism, her main interests are discourse analysis and the social dynamics emerging around political issues in France and Germany.

Katrin Herms

Large-scale diversity estimation through surname origin inference

In 2018, Antoine Mazières and Camille Roth published the article “Large-Scale Diversity Estimation Through Surname Origin Inference” in the Bulletin of Sociological Methodology. Antoine recently wrote an informal debriefing (in French) of the study, which gives us the chance to post about it on this site.

The abstract of the article is as follows:
The study of surnames as both linguistic and geographical markers of the past has proven valuable in several research fields spanning from biology and genetics to demography and social mobility. This article builds upon the existing literature to conceive and develop a surname origin classifier based on a data-driven typology. This enables us to explore a methodology to describe large-scale estimates of the relative diversity of social groups, especially when such data is scarcely available. We subsequently analyze the representativeness of surname origins for 15 socio-professional groups in France.

Welcome to Jonas Stein

Jonas Stein just joined the team for a four-month research internship on user confinement in Twitter networks under the supervision of Camille Roth and Jérémie Poiroux and within the project “SOCSEMICS”. Jonas has a background in agent-based simulation and social network analysis.
More information about him may be found on his LinkedIn profile.

Jonas Stein

Tubes and Bubbles – Topological confinement of recommendations on YouTube

The paper “Tubes and Bubbles – Topological confinement of recommendations on YouTube” by Camille Roth, Antoine Mazières and Telmo Menezes has just been published in PLOS ONE.

Contrary to popular belief about so-called “filter bubbles”, several recent studies show that recommendation algorithms generally do not contribute much, if at all, to user confinement; in some cases, they even seem to increase serendipity [see e.g., 1, 2, 3, 4, 5, 6].
Our study demonstrates, however, that this may not be the case on YouTube: be it in topological, topical or temporal terms, we show that the landscape defined by non-personalized YouTube recommendations is generally likely to confine users in homogeneous clusters of videos. Besides, the content for which confinement appears to be most significant also happens to garner the largest audience and thus, plausibly, the most viewing time.

The paper is available as an open-access article. We also set up a small popularization website, and you may read the CNRS press release.

Project RECORDS: open doctoral positions! D/L May 31, 2020

We are opening two doctoral fellowships from September 2020 in the framework of the ANR-funded project RECORDS that is focused on the understanding of practices and dynamics surrounding music streaming platforms.

One fellowship will be based at Géographie-cités in Paris, about the spatial dynamics underlying content consumption on streaming platforms, with music streaming as a primary case study.

The other fellowship will be based at Centre Marc Bloch in Berlin, and will address the large-scale and longitudinal study of algorithmic guidance in the context of music streaming platforms.

Please find the detailed call for applications here.

New ANR-funded grant called “RECORDS”

The team hosts a new ANR-funded grant called “RECORDS” (2020-2023), focused on the understanding of practices surrounding online content platforms, specifically in the context of music streaming, through a unique partnership with one of the major platforms in this area, Deezer.

The project generally aims at documenting the diversity of practices and behaviors on streaming platforms, understanding the effects of manual and algorithmic content recommendation, and describing the potential spatial diffusion of artists and works. RECORDS articulates quantitative and qualitative empirical protocols, by relying both on a unique source of usage data stemming directly from the platform (comprehensive listening histories of millions of users over several years) and on a large-scale survey (featuring tens of thousands of respondents) and associated interviews with a selection of consenting participants.

The project gathers about 25 researchers of diverse backgrounds including sociology, computer science and geography. It is being supervised by Thomas Louail (Géographie Cités), Philippe Coulangeon (Observatoire Sociologique du Changement), Camille Roth (Centre Marc Bloch), Jean-Samuel Beuscart (Orange Labs SENSE) and Manuel Moussallam (Deezer R&D).

The kick-off meeting will take place over two days in June 2020 at the Centre de Colloques of Campus Condorcet in Aubervilliers.

Welcome to Quentin Villermet

Quentin Villermet just joined the team for a five-month MSc research internship on music recommendation algorithms and their impact on listening practices under the supervision of Camille Roth and Jérémie Poiroux and within the new ANR project “RECORDS”. Quentin has a background in artificial intelligence and his interests include bio-inspired AI, statistics and network infrastructure. More information about him may be found on his LinkedIn profile.

Sunbelt 2020 & NetGLoW 2020: call for abstracts / socio-semantic session

Sunbelt is the main venue for social network analysis and its 40th edition will take place in Paris in June 2020. As a member of the scientific organizing committee, Camille Roth is pleased to announce the call for abstracts, oral presentations and posters. Proposals should be submitted by January 31, 2020 through this link.

Of particular interest to the team is the session “Advances in Socio-Semantic Network Analysis”, led by Iina Hellstein and co-organized by Nikita Basov, Johanne Saint-Charles, Adina Nerghes and Camille Roth. We particularly encourage submissions for this session, whose description follows.

A related session will also take place during the conference Networks in the Global World (St. Petersburg, July 7-9, 2020): the team further encourages submissions to the “Semantic and Socio-Semantic Networks” session by February 10, 2020.

Sunbelt 2020 session on “Advances in Socio-Semantic Network Analysis”
Social actors (stakeholders, group members, organizations) are linked (or separated) both by their social ties, and the content (knowledge, beliefs, frames, claims) they share (or do not share) in their communication. This interplay between the social relationships and the content of communication is increasingly approached as a socio-semantic network intertwining social and cultural, or cognitive and relational realms, where meanings and interactions coevolve.

This organized session addresses the recent advances in socio-semantic network analysis, and invites theoretical, methodological and empirical papers contributing (but not limited) to the following themes: (1) Theorizing relationships between social structure and meaning structure; (2) Qualitative, quantitative or mixed methods to relate meaning and social relationships; (3) Multivariate socio-semantic networks; (4) Relations between semantic similarity and social ties; (5) Combining relations between stakeholders and their frames; (6) Connecting macro- and micro-level social and semantic network patterns.

Automatic Discovery of Families of Network Generative Processes

Telmo Menezes had the opportunity to present his latest paper with Camille Roth, entitled “Automatic Discovery of Families of Network Generative Processes” and published earlier this year [SpringerLink] [arXiv], during an oral session at the Complex Networks 2019 conference in Lisbon.

This work relies on a machine learning approach that they introduced some years ago for automatically discovering plausible and human-understandable generators that fit and help explain observable complex networks. Recently, they expanded this work to identify families of generators, and demonstrated its application by discovering a small number of such families within a large corpus of Facebook ego networks. The abstract of the presentation in Lisbon offers a brief overview and can be found here (p. 225); the presentation is here.

Quali-Quantitative meeting – December 2019

The Computational Social Science Team organizes bimonthly internal meetings aimed at discussing “quali-quantitative” approaches. The point of these meetings is to present the work in progress carried out within the Pole’s framework and also to offer methodological workshops for training in digital approaches (database generation, corpus construction, processing, and so on). It is thus a forum for dialogue capable of generating new qualitative-quantitative research questions within the Centre Marc Bloch. Please note that it will progressively transform into a computational social science seminar open to an outside audience.

In December 2019, we were pleased to listen to:

  • Mirjam Dageförde, who presented her statistical work (with Emiliano Grossman) about “Selfish, not social! How voters derive their policy preferences”
  • Jérémie Poiroux, who presented his work on the filter bubbles that Twitter users may themselves help build. This work was part of the Algodiv project and will be continued with Camille Roth. The presentation can be found here (in French).

Interactional and Informational Attention on Twitter

Our paper “Interactional and Informational Attention on Twitter”, by Agathe Baltzer, Marton Karsai and Camille Roth, has just been published in Information 10(8), and is featured on its cover page. This work appraises the distribution of attention at the collective and individual levels on Twitter, from both a social (users) and a semantic (topics) viewpoint. We show the existence of socio-semantic attentional constraints and focus effects.

Neurons spike back

This article by Dominique Cardon, Jean-Philippe Cointet and Antoine Mazières retraces the history of artificial intelligence through the lens of the tension between symbolic and connectionist approaches. From a social history of science and technology perspective, it seeks to highlight how researchers, relying on the availability of massive data and the multiplication of computing power, have undertaken to reformulate the symbolic AI project by reviving the spirit of adaptive and inductive machines dating back to the era of cybernetics.
The full English version may be accessed here.

Open doctoral and post-doctoral positions! D/L: Sept 30, 2019

The team is now opening several positions for doctoral students and post-doctoral researchers to work under the ERC Consolidator grant Socsemics, focusing on internet echo chambers and polarization. These offers take place in an interdisciplinary context and touch on a variety of domains, essentially computational social science, political science, NLP, information visualization, sociology of the internet, social network analysis, and complex network modeling.

Detailed job offers may be found here, with an application deadline set for September 30th, 2019.

Please check the team presentation video and the “Socsemics” ERC project website.

Extensive information on the scientific content and context is available in the above-mentioned job offers; interested applicants may nonetheless feel free to contact Camille Roth (roth[@]) to discuss this further.

Appraising algorithmic biases

“Algorithmic Distortion of Informational Landscapes”, by Camille Roth, has just been published in Intellectica 70(1):97-118.
This review paper focuses on biases induced by recommendation algorithms. It explores the state of the art along a double dichotomy: first, regarding the discrepancy between users’ intentions and actions (1) under some algorithmic influence and (2) without it; second, by distinguishing algorithmic biases on (1) prior information rearrangement and (2) posterior information arrangement.
An open-access pre-print may be downloaded here.