
A Small Survey of Topic Models

2012-04-22 15:05
Reposted from: http://hi.baidu.com/flyer_hit/blog/item/772d7cfc341a98e6fd037f54.html



The survey is available at:
http://net.pku.edu.cn/~zhaoxin/Topic-model-xin-zhao-wayne.pdf

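Everything in the survey reflects my own subjective understanding, and the papers covered run up to around May of this year.
There are bound to be errors and omissions, but I have no time to revise it at the moment, so please bear with me.
My reasons for sharing this survey are:
1. To provide an index of papers on topic models;
2. To offer a basic classification, comparison, and discussion of these papers;
3. To help beginners, I hope.

Warning:
1. Please do not misuse this survey; respect the original work.
2. If you repost this, please credit the source clearly. A few days ago someone reposted it without the original link, and his followers then came and accused me of plagiarism.
3. I am not very familiar with model derivations (apart from Gibbs sampling), so please do not email me with questions about that part.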

Hi, All
I've collected some papers related to topic models.
For some research problems, I could not find enough supporting papers that use topic models. Please leave a message if you find that I have missed a relevant paper.
Note: if you would like to "retweet" this entry, please also include a link to my original entry. I have found people copying my articles into their own entries without any reference.
===========================

● Theory
  ■ Introduction
    ◆ Unsupervised learning by probabilistic latent semantic analysis.
    ◆ Latent Dirichlet allocation.
    ◆ Finding scientific topics.
    ◆ Rethinking LDA: Why Priors Matter
    ◆ On an equivalence between PLSI and LDA
  ■ Variations
    ◆ Correlated Topic Models.
    ◆ Hierarchical topic models and the nested Chinese restaurant process.
    ◆ Hierarchical Dirichlet processes.
    ◆ Nonparametric Bayes pachinko allocation.
    ◆ Topic Models with Power-Law Using Pitman-Yor Process
    ◆ Supervised topic models.
    ◆ Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
    ◆ Discriminative Topic Modeling based on Manifold Learning
    ◆ Interactive Topic Modeling
    ◆ Mixtures of hierarchical topics with pachinko allocation
    ◆ Incorporating domain knowledge into topic modeling via Dirichlet Forest priors
    ◆ Conditional topic random fields
    ◆ Markov random topic fields
    ◆ A two-dimensional topic-aspect model for discovering multi-faceted topics
    ◆ Generalized component analysis for text with heterogeneous attributes
  ■ Inference
    ◆ Gibbs Sampling:
      ● Finding scientific topics.
      ● Parameter estimation for text analysis
      ● Fast collapsed Gibbs sampling for latent Dirichlet allocation
      ● Distributed inference for latent Dirichlet allocation
    ◆ Variational EM
      ● Latent Dirichlet allocation.
  ■ Evaluation
    ◆ Reading tea leaves: How humans interpret topic models.
    ◆ Evaluation Methods for Topic Models
  ■ Online learning and scalability
    ◆ On-line LDA: Adaptive topic models for mining text streams with applications to topic detection and tracking
    ◆ Online variational inference for the hierarchical Dirichlet process.
    ◆ Online Learning for Latent Dirichlet Allocation
    ◆ Efficient Methods for Topic Model Inference on Streaming Document Collections
    ◆ Online Multiscale Dynamic Topic Models

● Applications
  ■ Classification
    ◆ DiscLDA: Discriminative learning for dimensionality reduction and classification
    ◆ Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
    ◆ MedLDA: maximum margin supervised topic models for regression and classification
  ■ Clustering
  ■ Network data (social network) mining
    ◆ Link-PLSA-LDA: A new unsupervised model for topics and influence of blogs
    ◆ Connections between the lines: augmenting social networks with text
    ◆ Relational topic models for document networks
    ◆ Topic and role discovery in social networks with experiments on Enron and academic email
    ◆ Group and topic discovery from relations and text
    ◆ Probabilistic models for discovering e-communities
    ◆ Arnetminer: Extraction and mining of academic social networks
    ◆ Community evolution in dynamic multi-mode networks
    ◆ An LDA-based community structure discovery approach for large-scale social networks
    ◆ Probabilistic community discovery using hierarchical latent Gaussian mixture model
    ◆ Modeling Evolutionary Behaviors for Community-based Dynamic Recommendation
    ◆ Joint group and topic discovery from relations and text
    ◆ Social topic models for community extraction
    ◆ Combining link and content for community detection: a discriminative approach
    ◆ Topic-Link LDA: Joint Models of Topic and Author Community
    ◆ Modeling hidden topics on document manifold
    ◆ Topic Modeling with Network Regularization
    ◆ Mining Topic-Level Influence in Heterogeneous Networks
    ◆ Utilizing Context in Generative Bayesian Models for Linked Corpus
  ■ Sentiment analysis and opinion mining
    ◆ Rated aspect summarization of short comments.
    ◆ Learning document-level semantic properties from free-text annotations.
    ◆ Joint sentiment/topic model for sentiment analysis.
    ◆ Mining multi-faceted overviews of arbitrary topics in a text collection
    ◆ Modeling online reviews with multi-grain topic models
    ◆ Topic sentiment mixture: modeling facets and opinions in weblogs.
    ◆ Multiple aspect ranking using the good grief algorithm.
    ◆ A joint model of text and aspect ratings for sentiment summarization.
    ◆ Opinion integration through semi-supervised topic modeling
    ◆ Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation.
    ◆ Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach
    ◆ Aspect and Sentiment Unification Model for Online Review Analysis
    ◆ An unsupervised aspect-sentiment model for online reviews
    ◆ Jointly Modeling Aspects and Opinions with a MaxEnt-LDA Hybrid.

  ■ Evolutionary text stream mining
    ◆ Discovering evolutionary theme patterns from text: an exploration of temporal text mining
    ◆ Topics over time: a non-Markov continuous-time model of topical trends
    ◆ Topic models over text streams: A study of batch and online unsupervised learning
    ◆ Mining correlated bursty topic patterns from coordinated text streams
    ◆ Topic Evolution in a stream of Documents
    ◆ Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora
    ◆ Studying the history of ideas using topic models
    ◆ Mining common topics from multiple asynchronous text streams.

  ■ Temporal and spatial data analysis
    ◆ A latent variable model for geographic lexical variation.
    ◆ Dynamic topic models
    ◆ A probabilistic approach to spatiotemporal theme pattern mining on weblogs
    ◆ Continuous time dynamic topic models
    ◆ Dynamic mixture models for multiple time series
    ◆ On-Line LDA: Adaptive Topic Models for Mining Text Streams
    ◆ Topic models over text streams: A study of batch and online unsupervised learning
    ◆ Spatial latent Dirichlet allocation
  ■ Scientific publication mining
    ◆ Finding scientific topics.
    ◆ The author-topic model for authors and documents.
    ◆ Statistical entity-topic models
    ◆ Probabilistic author-topic models for information discovery
    ◆ The author-recipient-topic model for topic and role discovery in social networks
    ◆ Expertise modeling for matching papers with reviewers
    ◆ Topic evolution and social interactions: how authors effect research
    ◆ Joint latent topic models for text and citations
    ◆ Co-ranking authors and documents in a heterogeneous network
    ◆ Mixed-membership models of scientific publications
    ◆ Modeling individual differences using Dirichlet processes
    ◆ Multi-aspect expertise matching for review assignment
    ◆ Topic-link LDA: joint models of topic and author community
    ◆ Group and topic discovery from relations and their attributes
    ◆ Exploiting Temporal Authors Interests via Temporal-Author-Topic Modeling, ADMA 2009
    ◆ Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation, ECIR 2009
    ◆ Mining a digital library for influential authors.
    ◆ Bibliometric Impact Measures Leveraging Topic Analysis.
    ◆ Context-aware Citation Recommendation
    ◆ Detecting Topic Evolution in Scientific Literature: How Can Citations Help?
    ◆ Latent Interest-Topic Model: Finding the causal relationships behind dyadic data
    ◆ A topic modeling approach and its integration into the random walk framework for academic search
  ■ Web data mining
    ◆ Latent topic models for hypertext
  ■ Information retrieval
    ◆ LDA-based document models for ad-hoc retrieval
    ◆ Exploring social annotations for information retrieval
    ◆ Modeling general and specific aspects of documents with a probabilistic topic model
    ◆ Exploring topic-based language models for effective web information retrieval
    ◆ Probabilistic Models for Expert Finding
  ■ Information extraction
    ◆ Employing Topic Models for Pattern-based Semantic Class Discovery
    ◆ Combining Concept Hierarchies and Statistical Topic Models
    ◆ A Probabilistic Approach for Adapting Information Extraction Wrappers and Discovering New Attributes
    ◆ An Unsupervised Framework for Extracting and Normalizing Product Attributes from Multiple Web Sites
    ◆ Learning to Adapt Web Information Extraction Knowledge and Discovering New Attributes via a Bayesian Approach
    ◆ Adapting Web Information Extraction Knowledge via Mining Site Invariant and Site Dependent Features
    ◆ Learning to Extract and Summarize Hot Item Features from Multiple Auction Web Sites
    ◆ Semi-supervised Extraction of Entity Aspects Using Topic Models

  ■ Annotations (or Tagging, Labeling) and recommendation
    ◆ Automatic labeling of multinomial topic models.
    ◆ Context modeling for ranking and tagging bursty features in text streams.
    ◆ Learning document-level semantic properties from free-text annotations.
    ◆ Generating summary keywords for emails using topics
    ◆ Semantic Annotation of Frequent Patterns
    ◆ Latent Dirichlet allocation for tag recommendation
    ◆ Tag-LDA for Scalable Real-time Tag Recommendation
    ◆ The Topic-Perspective Model for Social Tagging Systems
    ◆ A Probabilistic Topic-Connection Model for Automatic Image Annotation
    ◆ Clustering the Tagged Web
  ■ Summarization
    ◆ Topical keyphrase extraction from Twitter.
    ◆ Bayesian query-focused summarization
    ◆ Topic-based multi-document summarization with probabilistic latent semantic analysis
    ◆ Multi-topic based Query-oriented Summarization
    ◆ Multi-Document Summarization using Sentence-based Topic Models
    ◆ Generating Impact-Based Summaries for Scientific Literature
    ◆ Generating Comparative Summaries of Contradictory Opinions in Text
    ◆ Rated Aspect Summarization of Short Comments
    ◆ A Hybrid Hierarchical Model for Multi-Document Summarization
    ◆ Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining
    ◆ Latent Dirichlet allocation and singular value decomposition based multi-document summarization
  ■ Social media mining
    ◆ A latent variable model for geographic lexical variation.
    ◆ Empirical study of topic modeling in Twitter.
    ◆ Characterizing microblogs with topic models.
    ◆ TwitterRank: finding topic-sensitive influential twitterers.
    ◆ Comparing Twitter and traditional media using topic models.
  ■ DB
    ◆ Topic cube: Topic modeling for OLAP on multidimensional text databases
  ■ NLP tasks
    ◆ A topic model for word sense disambiguation
    ◆ Syntactic topic models
    ◆ Integrating topics and syntax
    ◆ Topic modeling: beyond bag-of-words
    ◆ A Bayesian LDA-based model for semi-supervised part-of-speech tagging
    ◆ Topical n-grams: Phrase and topic discovery, with an application to information retrieval
    ◆ Named entity recognition in query
    ◆ Multilingual topic models for unaligned text.
    ◆ Markov topic models.
    ◆ Modeling Syntactic Structures of Topics with a Nested HMM-LDA
    ◆ Topic segmentation with an aspect hidden Markov model.
    ◆ Polylingual Topic Models
    ◆ A Latent Dirichlet Allocation method for Selectional Preferences
    ◆ Improving word sense disambiguation using topic features
    ◆ Cross-Lingual Latent Topic Extraction
    ◆ Exploiting conversation structure in unsupervised topic segmentation for emails
    ◆ Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection