New submissions for Tue, 28 Sep 21 #277

Open
seunghyunhan opened this issue Sep 28, 2021 · 0 comments

Keyword: recommendation

Learning Neural Templates for Recommender Dialogue System

  • Authors: Zujie Liang, Huang Hu, Can Xu, Jian Miao, Yingying He, Yining Chen, Xiubo Geng, Fan Liang, Daxin Jiang
  • Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2109.12302
  • Pdf link: https://arxiv.org/pdf/2109.12302
  • Abstract
    Though recent end-to-end neural models have shown promising progress on Conversational Recommender Systems (CRS), two key challenges still remain. First, the recommended items cannot always be incorporated into the generated replies precisely and appropriately. Second, only the items mentioned in the training corpus have a chance to be recommended in the conversation. To tackle these challenges, we introduce a novel framework called NTRD for recommender dialogue systems that decouples the dialogue generation from the item recommendation. NTRD has two key components, i.e., a response template generator and an item selector. The former adopts an encoder-decoder model to generate a response template with slot locations tied to target items, while the latter fills in slot locations with the proper items using a sufficient attention mechanism. Our approach combines the strengths of both classical slot filling approaches (that are generally controllable) and modern neural NLG approaches (that are generally more natural and accurate). Extensive experiments on the benchmark ReDial show our NTRD significantly outperforms the previous state-of-the-art methods. Besides, our approach has the unique advantage of producing novel items that do not appear in the training set of the dialogue corpus. The code is available at https://github.com/jokieleung/NTRD.

Topic Model Robustness to Automatic Speech Recognition Errors in Podcast Transcripts

  • Authors: Raluca Alexandra Fetic, Mikkel Jordahn, Lucas Chaves Lima, Rasmus Arpe Fogh Egebæk, Martin Carsten Nielsen, Benjamin Biering, Lars Kai Hansen
  • Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2109.12306
  • Pdf link: https://arxiv.org/pdf/2109.12306
  • Abstract
    For a multilingual podcast streaming service, it is critical to be able to deliver relevant content to all users independent of language. Podcast content relevance is conventionally determined using various metadata sources. However, with the increasing quality of speech recognition in many languages, utilizing automatic transcriptions to provide better content recommendations becomes possible. In this work, we explore the robustness of a Latent Dirichlet Allocation topic model when applied to transcripts created by an automatic speech recognition engine. Specifically, we explore how increasing transcription noise influences topics obtained from transcriptions in Danish, a low-resource language. First, we observe a baseline of cosine similarity scores between topic embeddings from automatic transcriptions and the descriptions of the podcasts written by the podcast creators. We then observe how the cosine similarities decrease as transcription noise increases and conclude that even when automatic speech recognition transcripts are erroneous, it is still possible to obtain high-quality topic embeddings from the transcriptions.
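
The comparison described in this abstract rests on cosine similarity between topic embeddings. The following is a minimal sketch of that metric only, not the authors' pipeline: the three topic vectors are hypothetical values made up for illustration, and no LDA model or speech recognition engine is involved.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen here: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical topic distributions over three topics (illustrative only).
description_topics = [0.70, 0.20, 0.10]        # from the creator's description
clean_transcript_topics = [0.65, 0.25, 0.10]   # from a low-noise transcript
noisy_transcript_topics = [0.40, 0.35, 0.25]   # from a high-noise transcript

clean_score = cosine_similarity(description_topics, clean_transcript_topics)
noisy_score = cosine_similarity(description_topics, noisy_transcript_topics)
print(f"clean: {clean_score:.3f}, noisy: {noisy_score:.3f}")
```

With these made-up vectors the noisier transcript scores lower against the description, which is the kind of trend the abstract reports measuring.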

MC$^2$-SF: Slow-Fast Learning for Mobile-Cloud Collaborative Recommendation

  • Authors: Zeyuan Chen, Jiangchao Yao, Feng Wang, Kunyang Jia, Bo Han, Wei Zhang, Hongxia Yang
  • Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2109.12314
  • Pdf link: https://arxiv.org/pdf/2109.12314
  • Abstract
    With the hardware development of mobile devices, it is possible to build recommendation models on the mobile side to utilize fine-grained features and real-time feedback. Compared to the straightforward mobile-based modeling appended to the cloud-based modeling, we propose a Slow-Fast learning mechanism to make the Mobile-Cloud Collaborative recommendation (MC$^2$-SF) mutually beneficial. Specifically, in our MC$^2$-SF, the cloud-based model and the mobile-based model are respectively treated as the slow component and the fast component, according to their interaction frequency in real-world scenarios. During training and serving, they communicate prior/privileged knowledge to each other to help better capture the user interests about the candidates, resembling the role of System I and System II in human cognition. We conduct extensive experiments on three benchmark datasets and demonstrate that the proposed MC$^2$-SF outperforms several state-of-the-art methods.

Deep Exploration for Recommendation Systems

  • Authors: Zheqing Zhu, Benjamin Van Roy
  • Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2109.12509
  • Pdf link: https://arxiv.org/pdf/2109.12509
  • Abstract
    We investigate the design of recommendation systems that can efficiently learn from sparse and delayed feedback. Deep Exploration can play an important role in such contexts, enabling a recommendation system to much more quickly assess a user's needs and personalize service. We design an algorithm based on Thompson Sampling that carries out Deep Exploration. We demonstrate through simulations that the algorithm can substantially amplify the rate of positive feedback relative to common recommendation system designs in a scalable fashion. These results demonstrate promise that we hope will inspire engineering of production recommendation systems that leverage Deep Exploration.
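
For readers unfamiliar with Thompson Sampling, here is a minimal sketch of its simplest form, a Beta-Bernoulli bandit with immediate feedback. This is not the paper's algorithm: the abstract describes Deep Exploration under sparse and delayed feedback, which this toy example does not model. The function name, click rates, and round count are all hypothetical.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling; return pull counts per item."""
    rng = random.Random(seed)
    n_items = len(true_rates)
    successes = [0] * n_items
    failures = [0] * n_items
    pulls = [0] * n_items
    for _ in range(rounds):
        # Sample a plausible click rate for each item from its Beta posterior
        # (uniform Beta(1, 1) prior), then recommend the best-looking item.
        samples = [
            rng.betavariate(s + 1, f + 1)
            for s, f in zip(successes, failures)
        ]
        item = max(range(n_items), key=samples.__getitem__)
        pulls[item] += 1
        # Simulated user feedback: a click with the item's true probability.
        if rng.random() < true_rates[item]:
            successes[item] += 1
        else:
            failures[item] += 1
    return pulls


# Hypothetical click probabilities for three items.
pulls = thompson_sampling([0.1, 0.5, 0.9], rounds=2000)
print(pulls)
```

Because recommendations are drawn from the posterior rather than chosen greedily, uncertain items still get tried early on, while most recommendations concentrate on the item with the highest click rate as evidence accumulates.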

DemiNet: Dependency-Aware Multi-Interest Network with Self-Supervised Graph Learning for Click-Through Rate Prediction

  • Authors: Yule Wang, Qiang Luo, Yue Ding, Dong Wang, Hongbo Deng
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2109.12512
  • Pdf link: https://arxiv.org/pdf/2109.12512
  • Abstract
    In this paper, we propose a novel model named DemiNet (short for DEpendency-Aware Multi-Interest Network) to address the above two issues. To be specific, we first consider various dependency types between item nodes and perform dependency-aware heterogeneous attention for denoising and obtaining accurate sequence item representations. Secondly, for multiple interests extraction, multi-head attention is conducted on top of the graph embedding. To filter out noisy inter-item correlations and enhance the robustness of extracted interests, self-supervised interest learning is introduced to the above two steps. Thirdly, to aggregate the multiple interests, interest experts corresponding to different interest routes give rating scores respectively, while a specialized network assigns the confidence of each score. Experimental results on three real-world datasets demonstrate that the proposed DemiNet significantly improves the overall recommendation performance over several state-of-the-art baselines. Further studies verify the efficacy and interpretability benefits brought from the fine-grained user interest modeling.

Why Do We Click: Visual Impression-aware News Recommendation

  • Authors: Jiahao Xun, Shengyu Zhang, Zhou Zhao, Jieming Zhu, Qi Zhang, Jingjie Li, Xiuqiang He, Xiaofei He, Tat-Seng Chua, Fei Wu
  • Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
  • Arxiv link: https://arxiv.org/abs/2109.12651
  • Pdf link: https://arxiv.org/pdf/2109.12651
  • Abstract
    There is a soaring interest in the news recommendation research scenario due to the information overload. To accurately capture users' interests, we propose to model multi-modal features, in addition to the news titles that are widely used in existing works, for news recommendation. Besides, existing research pays little attention to the click decision-making process in designing multi-modal modeling modules. In this work, inspired by the fact that users make their click decisions mostly based on the visual impression they perceive when browsing news, we propose to capture such visual impression information with visual-semantic modeling for news recommendation. Specifically, we devise the local impression modeling module to simultaneously attend to decomposed details in the impression when understanding the semantic meaning of news title, which could explicitly get close to the process of users reading news. In addition, we inspect the impression from a global view and take structural information, such as the arrangement of different fields and spatial position of different words on the impression, into the modeling of multiple modalities. To accommodate the research of visual impression-aware news recommendation, we extend the text-dominated news recommendation dataset MIND by adding snapshot impression images and will release it to nourish the research field. Extensive comparisons with the state-of-the-art news recommenders along with the in-depth analyses demonstrate the effectiveness of the proposed method and the promising capability of modeling visual impressions for the content-based recommenders.

Review of Clustering-Based Recommender Systems

  • Authors: Irina Beregovskaya, Mikhail Koroteev
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2109.12839
  • Pdf link: https://arxiv.org/pdf/2109.12839
  • Abstract
    Recommender systems are one of the most applied methods in machine learning and find applications in many areas, ranging from economics to the Internet of things. This article provides a general overview of modern approaches to recommender system design using clustering as a preliminary step to improve overall performance. Using clustering can address several known issues in recommendation systems, including increasing the diversity, consistency, and reliability of recommendations; the data sparsity of user-preference matrices; and changes in user preferences over time. This work will be useful for both beginners in the field of recommender systems and specialists in related fields that are interested in examining the applicability of recommender systems. This review is focused on the analysis of the scientific literature on the topics of recommender systems and clustering models that have appeared in recent years and contains a representative list of the literature for the further exploration of this topic. In the first part, a brief introduction to the so-called classic or traditional recommendation algorithms is given, along with an overview of the clustering problem.

How does fake news spread? Understanding pathways of disinformation spread through APIs

  • Authors: Lynnette H. X. Ng, Araz Taeihagh
  • Subjects: Computers and Society (cs.CY); Cryptography and Security (cs.CR); Human-Computer Interaction (cs.HC); Social and Information Networks (cs.SI)
  • Arxiv link: https://arxiv.org/abs/2109.12865
  • Pdf link: https://arxiv.org/pdf/2109.12865
  • Abstract
    What are the pathways for spreading disinformation on social media platforms? This article addresses this question by collecting, categorising, and situating an extensive body of research on how application programming interfaces (APIs) provided by social media platforms facilitate the spread of disinformation. We first examine the landscape of official social media APIs, then perform quantitative research on the open-source code repositories GitHub and GitLab to understand the usage patterns of these APIs. By inspecting the code repositories, we classify developers' usage of the APIs as official and unofficial, and further develop a four-stage framework characterising pathways for spreading disinformation on social media platforms. We further highlight how the stages in the framework were activated during the 2016 US Presidential Elections, before providing policy recommendations for issues relating to access to APIs, algorithmic content, and advertisements, and suggesting rapid response to coordinated campaigns, development of collaborative and participatory approaches, and government stewardship in the regulation of social media platforms.

ICMT: Item Cluster-Wise Multi-Objective Training for Long-Tail Recommendation

  • Authors: Yule Wang, Xin Xin, Yue Ding, Dong Wang
  • Subjects: Information Retrieval (cs.IR)
  • Arxiv link: https://arxiv.org/abs/2109.12887
  • Pdf link: https://arxiv.org/pdf/2109.12887
  • Abstract
    Item recommendation based on historical user-item interactions is of vital importance for web-based services. However, the data used to train a recommender system (RS) suffers from severe popularity bias. Interactions of a small fraction of popular (head) items account for almost the whole training data. Normal training methods from such biased data tend to repetitively generate recommendations from the head items, which further exacerbates the data bias and affects the exploration of potentially interesting items from niche (tail) items. In this paper, we explore the central theme of long-tail recommendation. Through an empirical study, we find that head items are very likely to be recommended due to the fact that the gradients coming from head items dominate the overall gradient update process, which further affects the optimization of tail items. To this end, we propose a general framework, namely Item Cluster-Wise Multi-Objective Training (ICMT), for long-tail recommendation. Firstly, disentangled representation learning is utilized to identify the popularity impact behind user-item interactions. Then item clusters are adaptively formulated according to the disentangled popularity representation. After that, we consider the learning over the whole training data as a weighted aggregation of multiple item cluster-wise objectives, which can be resolved through a Pareto-Efficient solver for a harmonious overall gradient direction. Besides, a contractive loss focusing on model robustness is derived as a regularization term. We instantiate ICMT with three state-of-the-art recommendation models and conduct experiments on three real-world datasets. Experimental results demonstrate that the proposed ICMT significantly improves the overall recommendation performance, especially on tail items.

An IIoT machine model for achieving consistency in product quality in manufacturing plants

  • Authors: Abhik Banerjee, Abdur Rahim Mohammad Forkan, Dimitrios Georgakopoulos, Josip Karabotic Milovac, Prem Prakash Jayaraman
  • Subjects: Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2109.12964
  • Pdf link: https://arxiv.org/pdf/2109.12964
  • Abstract
    Consistency in product quality is of critical importance in manufacturing. However, achieving a target product quality typically involves balancing a large number of manufacturing attributes. Existing manufacturing practices for dealing with such complexity are driven largely based on human knowledge and experience. The prevalence of manual intervention makes it difficult to perfect manufacturing practices, underscoring the need for a data-driven solution. In this paper, we present an Industrial Internet of Things (IIoT) machine model which enables effective monitoring and control of plant machinery so as to achieve consistency in product quality. We present algorithms that can provide product quality prediction during production, and provide recommendations for machine control. Subsequently, we perform an experimental evaluation of the proposed solution using real data captured from a food processing plant. We show that the proposed algorithms can be used to predict product quality with a high degree of accuracy, thereby enabling effective production monitoring and control.

Keyword: semi-supervised

Introspective Robot Perception using Smoothed Predictions from Bayesian Neural Networks

  • Authors: Jianxiang Feng, Maximilian Durner, Zoltan-Csaba Marton, Ferenc Balint-Benczedi, Rudolph Triebel
  • Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2109.12869
  • Pdf link: https://arxiv.org/pdf/2109.12869
  • Abstract
    This work focuses on improving uncertainty estimation in the field of object classification from RGB images and demonstrates its benefits in two robotic applications. We employ a Bayesian Neural Network (BNN), and evaluate two practical inference techniques to obtain better uncertainty estimates, namely Concrete Dropout (CDP) and Kronecker-factored Laplace Approximation (LAP). We show a performance increase using more reliable uncertainty estimates as unary potentials within a Conditional Random Field (CRF), which is able to incorporate contextual information as well. Furthermore, the obtained uncertainties are exploited to achieve domain adaptation in a semi-supervised manner, which requires less manual effort in annotating data. We evaluate our approach on two public benchmark datasets that are relevant for robot perception tasks.

Challenging the Semi-Supervised VAE Framework for Text Classification

  • Authors: Ghazi Felhi, Joseph Le Roux, Djamé Seddah
  • Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2109.12969
  • Pdf link: https://arxiv.org/pdf/2109.12969
  • Abstract
    Semi-Supervised Variational Autoencoders (SSVAEs) are widely used models for data efficient learning. In this paper, we question the adequacy of the standard design of sequence SSVAEs for the task of text classification as we exhibit two sources of overcomplexity for which we provide simplifications. These simplifications to SSVAEs preserve their theoretical soundness while providing a number of practical advantages in the semi-supervised setup where the result of training is a text classifier. These simplifications are the removal of (i) the Kullback-Leibler divergence from its objective and (ii) the fully unobserved latent variable from its probabilistic model. These changes relieve users from choosing a prior for their latent variables, make the model smaller and faster, and allow for a better flow of information into the latent variables. We compare the simplified versions to standard SSVAEs on 4 text classification tasks. On top of the above-mentioned simplification, experiments show a speed-up of 26%, while keeping equivalent classification scores. The code to reproduce our experiments is public.

Semi-Supervised Adversarial Discriminative Domain Adaptation

  • Authors: Thai-Vu Nguyen, Anh Nguyen, Bac Le
  • Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
  • Arxiv link: https://arxiv.org/abs/2109.13016
  • Pdf link: https://arxiv.org/pdf/2109.13016
  • Abstract
    Domain adaptation is a potential method to train a powerful deep neural network, which can handle the absence of labeled data. More precisely, domain adaptation addresses the limitation called dataset bias or domain shift, which arises when the training dataset and testing dataset are extremely different. Adversarial adaptation methods are becoming popular among domain adaptation methods. Relying on the idea of GANs, adversarial domain adaptation tries to minimize the distribution discrepancy between training and testing datasets based on an adversarial objective. However, some conventional adversarial domain adaptation methods cannot handle large domain shifts between two datasets, or the generalization ability of these methods is poor. In this paper, we propose an improved adversarial domain adaptation method called Semi-Supervised Adversarial Discriminative Domain Adaptation (SADDA), which can overcome the limitations of other domain adaptation methods. We also show that SADDA has better performance than other adversarial adaptation methods and illustrate the promise of our method on digit classification and emotion recognition problems.

Discovering Drug-Target Interaction Knowledge from Biomedical Literature

  • Authors: Yutai Hou, Yingce Xia, Lijun Wu, Shufang Xie, Yang Fan, Jinhua Zhu, Wanxiang Che, Tao Qin, Tie-Yan Liu
  • Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
  • Arxiv link: https://arxiv.org/abs/2109.13187
  • Pdf link: https://arxiv.org/pdf/2109.13187
  • Abstract
    The Interaction between Drugs and Targets (DTI) in the human body plays a crucial role in biomedical science and applications. As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from biomedical literature, usually in the form of triplets about drugs, targets and their interaction, becomes an urgent demand in the industry. Existing methods of discovering biological knowledge are mainly extractive approaches that often require detailed annotations (e.g., all mentions of biological entities, relations between every two entity mentions, etc.). However, it is difficult and costly to obtain sufficient annotations due to the requirement of expert knowledge from biomedical domains. To overcome these difficulties, we explore the first end-to-end solution for this task by using generative approaches. We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations. Further, we propose a semi-supervised method, which leverages the aforementioned end-to-end model to filter unlabeled literature and label it. Experimental results show that our method significantly outperforms extractive baselines on DTI discovery. We also create a dataset, KD-DTI, to advance this task and will release it to the community.
seunghyunhan self-assigned this Sep 28, 2021