Machine Learning Challenges
CIKM Cup 2016 Track 1: Cross-Device Entity Linking Challenge
I developed an ML approach to identify the same user across multiple devices and build an accurate user identity. My solution ranked 5th among 300+ teams. The technical report is available at https://arxiv.org/abs/1612.07117 and the implementation at https://github.com/namkhanhtran/cikm-cup-2016-cross-device
ACM RecSys Challenge 2016
I developed a hybrid learning method, combining content-based and collaborative filtering approaches, to predict the job postings that a user will positively interact with (e.g. click, bookmark). My implementation ranked 30th among 109 teams.
eLabour: Interdisciplinary centre for IT-based qualitative sociological research
The BMBF-funded Centre will focus on interdisciplinary methods in the novel area of secondary analysis of qualitative material and will foster their uptake. Contribution: Implemented novel and tailored IT methods such as search and contextualization for sociological secondary analysis.
ForgetIT: Concise Preservation by Combining Managed Forgetting and Contextualized Remembering
ForgetIT is a large-scale EU project (total budget: €9 million) that aims to develop a next-generation digital preservation framework with a focus on personal and organisational settings. Contribution: Proposed and implemented ML methods in the contextualization package, which bring context to the objects (texts) to be preserved.
GuteArbeit: Exploring and Contextualizing Qualitative Data for Secondary Analysis
In the social sciences, a tremendous body of data is being collected through observations and interviews. Given their long-standing research tradition, scientists are accumulating information about behavior, attitudes and beliefs at specific time frames: “realities” that cannot be captured later. “Gute Arbeit” enables intelligent access to such qualitative data gathered within diverse contexts by developing, adapting and advancing approaches from Information Retrieval and Data Mining. Contribution: Proposed Topic Cropping, a method based on Latent Dirichlet Allocation, and developed a prototype to support sociologists in exploring qualitative studies.