
Conditional Random Fields(CRF)

2011-05-23 16:39
Original: http://www.inference.phy.cam.ac.uk/hmw26/crf/

Nicely written; I'll translate it when I have time.

This page contains material on, or relating to, conditional random fields. I shall continue to update this page as research on conditional random fields advances, so do check back periodically. If you feel there is something that should be on here but isn't, then please email me (hmw26 -at- srcf.ucam.org) and let me know.

introduction

Conditional random fields (CRFs) are a probabilistic framework for labeling and segmenting structured data, such as sequences, trees and lattices. The underlying idea is that of defining a conditional probability distribution over label sequences given a particular observation sequence, rather than a joint distribution over both label and observation sequences. The primary advantage of CRFs over hidden Markov models is their conditional nature, resulting in the relaxation of the independence assumptions required by HMMs in order to ensure tractable inference. Additionally, CRFs avoid the label bias problem, a weakness exhibited by maximum entropy Markov models (MEMMs) and other conditional Markov models based on directed graphical models. CRFs outperform both MEMMs and HMMs on a number of real-world tasks in many fields, including bioinformatics, computational linguistics and speech recognition.
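
For concreteness, here is the standard linear-chain form of this conditional distribution (generic notation, not tied to any single paper below): feature functions f_k inspect adjacent labels and the whole observation sequence, and learned weights lambda_k score them.

```latex
% Linear-chain CRF: conditional probability of a label sequence y
% given an observation sequence x, with feature functions f_k and
% weights \lambda_k (standard formulation).
p(\mathbf{y} \mid \mathbf{x}) =
  \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big),
\qquad
Z(\mathbf{x}) = \sum_{\mathbf{y}'} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Big)
```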

tutorial

Hanna M. Wallach. Conditional Random Fields: An Introduction. Technical Report MS-CIS-04-21. Department of Computer and Information Science, University of Pennsylvania, 2004.

papers by year

2001

John Lafferty, Andrew McCallum, Fernando Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML-2001), 2001.

We present conditional random fields, a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

2002

Hanna Wallach. Efficient Training of Conditional Random Fields. M.Sc. thesis, Division of Informatics, University of Edinburgh, 2002.

This thesis explores a number of parameter estimation techniques for conditional random fields, a recently introduced probabilistic model for labelling and segmenting sequential data. Theoretical and practical disadvantages of the training techniques reported in current literature on CRFs are discussed. We hypothesise that general numerical optimisation techniques result in improved performance over iterative scaling algorithms for training CRFs. Experiments run on a subset of a well-known text chunking data set confirm that this is indeed the case. This is a highly promising result, indicating that such parameter estimation techniques make CRFs a practical and efficient choice for labelling sequential data, as well as a theoretically sound and principled probabilistic framework.
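
To illustrate the thesis's point about general-purpose numerical optimisation, here is a minimal sketch (toy features, toy data, and brute-force computation of the partition function are all my assumptions, not the thesis's code) that trains a tiny linear-chain CRF with L-BFGS:

```python
import itertools
import numpy as np
from scipy.optimize import minimize

LABELS = (0, 1)  # toy binary label set

def features(x, y):
    # Global feature vector: 4 observation/label features plus
    # 4 label-transition features (all toy choices).
    f = np.zeros(8)
    for t, (xt, yt) in enumerate(zip(x, y)):
        f[xt * 2 + yt] += 1                 # observation xt with label yt
        if t > 0:
            f[4 + y[t - 1] * 2 + yt] += 1   # transition y_{t-1} -> y_t
    return f

def log_Z(x, w):
    # Brute-force partition function; fine for toy-length sequences.
    scores = [w @ features(x, y) for y in itertools.product(LABELS, repeat=len(x))]
    m = max(scores)
    return m + np.log(sum(np.exp(s - m) for s in scores))

def neg_log_likelihood(w, data):
    # Negative conditional log-likelihood, minimised by L-BFGS below.
    return -sum(w @ features(x, y) - log_Z(x, w) for x, y in data)

data = [((0, 1, 1), (0, 1, 1)), ((1, 0, 0), (1, 0, 0))]  # toy (x, y) pairs
result = minimize(neg_log_likelihood, np.zeros(8), args=(data,), method="L-BFGS-B")
print("trained weights:", result.x)
```

A real implementation would compute log Z(x) with the forward algorithm and supply an analytic gradient (empirical minus expected feature counts); the enumeration above just keeps the sketch short.
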
Thomas G. Dietterich. Machine Learning for Sequential Data: A Review. In Structural, Syntactic, and Statistical Pattern Recognition; Lecture Notes in Computer Science, Vol. 2396, T. Caelli (Ed.), pp. 15–30, Springer-Verlag, 2002.

Statistical learning problems in many fields involve sequential data. This paper formalizes the principal learning tasks and describes the methods that have been developed within the machine learning research community for addressing these problems. These methods include sliding window methods, recurrent sliding windows, hidden Markov models, conditional random fields, and graph transformer networks. The paper also discusses some open research issues.

2003

Fei Sha and Fernando Pereira. Shallow Parsing with Conditional Random Fields. In Proceedings of the 2003 Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT/NAACL-03), 2003.

Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model. Improved training methods based on modern optimization algorithms were critical in achieving these results. We present extensive comparisons between models and training methods that confirm and strengthen previous results on shallow parsing and training methods for maximum-entropy models.
Andrew McCallum. Efficiently Inducing Features of Conditional Random Fields. In Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence (UAI-2003), 2003.

Conditional Random Fields (CRFs) are undirected graphical models, a special case of which correspond to conditionally-trained finite state machines. A key advantage of CRFs is their great flexibility to include a wide variety of arbitrary, non-independent features of the input. Faced with this freedom, however, an important question remains: what features should be used? This paper presents an efficient feature induction method for CRFs. The method is founded on the principle of iteratively constructing feature conjunctions that would significantly increase conditional log-likelihood if added to the model. Automated feature induction enables not only improved accuracy and dramatic reduction in parameter count, but also the use of larger cliques, and more freedom to liberally hypothesize atomic input variables that may be relevant to a task. The method applies to linear-chain CRFs, as well as to more arbitrary CRF structures, such as Relational Markov Networks, where it corresponds to learning clique templates, and can also be understood as supervised structure learning. Experimental results on named entity extraction and noun phrase segmentation tasks are presented.
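
The selection principle the abstract describes can be summarised by a likelihood gain score: each candidate feature g is rated by the best improvement in conditional log-likelihood obtainable with a single new weight mu (a sketch of the usual criterion, in my notation rather than the paper's):

```latex
% Gain of adding candidate feature g with weight \mu to a model with
% current log-likelihood L(\Lambda); the highest-gain candidates are
% added at each induction round.
\mathrm{Gain}(g) \;=\; \max_{\mu}\; \big[\, L(\Lambda \cup \{\mu g\}) - L(\Lambda) \,\big]
```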
David Pinto, Andrew McCallum, Xing Wei and W. Bruce Croft. Table Extraction Using Conditional Random Fields. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), 2003.

The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content present difficulties for traditional language modeling techniques, however. This paper presents the use of conditional random fields (CRFs) for table extraction, and compares them with hidden Markov models (HMMs). Unlike HMMs, CRFs support the use of many rich and overlapping layout and language features, and as a result, they perform significantly better. We show experimental results on plain-text government statistical reports in which tables are located with 92% F1, and their constituent lines are classified into 12 table-related categories with 94% accuracy. We also discuss future work on undirected graphical models for segmenting columns, finding cells, and classifying them as data cells or label cells.
Andrew McCallum and Wei Li. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. In Proceedings of the Seventh Conference on Natural Language Learning (CoNLL), 2003.

Wei Li and Andrew McCallum. Rapid Development of Hindi Named Entity Recognition Using Conditional Random Fields and Feature Induction. In ACM Transactions on Asian Language Information Processing (TALIP), 2003.

This paper describes our application of conditional random fields with feature induction to a Hindi named entity recognition task. With only five days development time and little knowledge of this language, we automatically discover relevant features by providing a large array of lexical tests and using feature induction to automatically construct the features that most increase conditional likelihood. In an effort to reduce overfitting, we use a combination of a Gaussian prior and early stopping based on the results of 10-fold cross validation.
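
The Gaussian prior mentioned here amounts to the usual quadratic penalty on the weights, i.e. a penalised conditional log-likelihood (standard formulation, not quoted from the paper):

```latex
% Conditional log-likelihood with a Gaussian prior of variance
% \sigma^2 on each weight \lambda_k.
L(\Lambda) \;=\; \sum_{i} \log p(\mathbf{y}^{(i)} \mid \mathbf{x}^{(i)})
            \;-\; \sum_{k} \frac{\lambda_k^{2}}{2\sigma^{2}}
```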
Yasemin Altun and Thomas Hofmann. Large Margin Methods for Label Sequence Learning. In Proceedings of 8th European Conference on Speech Communication and Technology (EuroSpeech), 2003.

Label sequence learning is the problem of inferring a state sequence from an observation sequence, where the state sequence may encode a labeling, annotation or segmentation of the sequence. In this paper we give an overview of discriminative methods developed for this problem. Special emphasis is put on large margin methods by generalizing multiclass Support Vector Machines and AdaBoost to the case of label sequences. An experimental evaluation demonstrates the advantages over classical approaches like Hidden Markov Models and the competitiveness with methods like Conditional Random Fields.
Simon Lacoste-Julien. Combining SVM with graphical models for supervised classification: an introduction to Max-Margin Markov Networks. CS281A Project Report, UC Berkeley, 2003.

The goal of this paper is to present a survey of the concepts needed to understand the novel Max-Margin Markov Networks (M3-net) framework, a new formalism invented by Taskar, Guestrin and Koller which combines both the advantages of the graphical models and the Support Vector Machines (SVMs) to solve the problem of multi-label multi-class supervised classification. We will compare generative models, discriminative graphical models and SVMs for this task, introducing the basic concepts at the same time, leading at the end to a presentation of the M3-net paper.

2004

Andrew McCallum, Khashayar Rohanimanesh and Charles Sutton. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. Workshop on Syntax, Semantics, Statistics; 16th Annual Conference on Neural Information Processing Systems (NIPS 2003), 2004.

Conditional random fields (CRFs) for sequence modeling have several advantages over joint models such as HMMs, including the ability to relax strong independence assumptions made in those models, and the ability to incorporate arbitrary overlapping features. Previous work has focused on linear-chain CRFs, which correspond to finite-state machines, and have efficient exact inference algorithms. Often, however, we wish to label sequence data in multiple interacting ways—for example, performing part-of-speech tagging and noun phrase segmentation simultaneously, increasing joint accuracy by sharing information between them. We present dynamic conditional random fields (DCRFs), which are CRFs in which each time slice has a set of state variables and edges—a distributed state representation as in dynamic Bayesian networks—and parameters are tied across slices. (They could also be called conditionally-trained Dynamic Markov Networks.) Since exact inference can be intractable in these models, we perform approximate inference using the tree-based reparameterization framework (TRP). We also present empirical results comparing DCRFs with linear-chain CRFs on natural-language data.
Kevin Murphy, Antonio Torralba and William T. Freeman. Using the forest to see the trees: a graphical model relating features, objects and scenes. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

Standard approaches to object detection focus on local patches of the image, and try to classify them as background or not. We propose to use the scene context (image as a whole) as an extra source of (global) information, to help resolve local ambiguities. We present a conditional random field for jointly solving the tasks of object detection and scene classification.
Sanjiv Kumar and Martial Hebert. Discriminative Fields for Modeling Spatial Dependencies in Natural Images. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

In this paper we present Discriminative Random Fields (DRFs), a discriminative framework for the classification of natural image regions by incorporating neighborhood spatial dependencies in the labels as well as the observed data. The proposed model exploits local discriminative models and relaxes the assumption of conditional independence of the observed data given the labels, commonly used in the Markov Random Field (MRF) framework. The parameters of the DRF model are learned using a penalized maximum pseudo-likelihood method. Furthermore, the form of the DRF model allows MAP inference for binary classification problems using graph min-cut algorithms. The performance of the model was verified on synthetic as well as real-world images. The DRF model outperforms the MRF model in the experiments.
Ben Taskar, Carlos Guestrin and Daphne Koller. Max-Margin Markov Networks. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

In typical classification tasks, we seek a function which assigns a label to a single object. Kernel-based approaches, such as support vector machines (SVMs), which maximize the margin of confidence of the classifier, are the method of choice for many such tasks. Their popularity stems both from the ability to use high-dimensional feature spaces, and from their strong theoretical guarantees. However, many real-world tasks involve sequential, spatial, or structured data, where multiple labels must be assigned. Existing kernel-based methods ignore structure in the problem, assigning labels independently to each object, losing much useful information. Conversely, probabilistic graphical models, such as Markov networks, can represent correlations between labels, by exploiting problem structure, but cannot handle high-dimensional feature spaces, and lack strong theoretical generalization guarantees. In this paper, we present a new framework that combines the advantages of both approaches: Maximum margin Markov (M3) networks incorporate both kernels, which efficiently deal with high-dimensional features, and the ability to capture correlations in structured data. We present an efficient algorithm for learning M3 networks based on a compact quadratic program formulation. We provide a new theoretical bound for generalization in structured domains. Experiments on the task of handwritten character recognition and collective hypertext classification demonstrate very significant gains over previous approaches.
Burr Settles. Biomedical Named Entity Recognition Using Conditional Random Fields and Rich Feature Sets. To appear in Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA), 2004.

A demo of the system can be downloaded here.

As the wealth of biomedical knowledge in the form of literature increases, there is a rising need for effective natural language processing tools to assist in organizing, curating, and retrieving this information. To that end, named entity recognition (the task of identifying words and phrases in free text that belong to certain classes of interest) is an important first step for many of these larger information management goals. In recent years, much attention has been focused on the problem of recognizing gene and protein mentions in biomedical abstracts. This paper presents a framework for simultaneously recognizing occurrences of PROTEIN, DNA, RNA, CELL-LINE, and CELL-TYPE entity classes using Conditional Random Fields with a variety of traditional and novel features. I show that this approach can achieve an overall F measure around 70, which seems to be the current state of the art.
Charles Sutton, Khashayar Rohanimanesh and Andrew McCallum. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

In sequence modeling, we often wish to represent complex interaction between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when long-range dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linear-chain conditional random fields (CRFs) in which each time slice contains a set of state variables and edges—a distributed state representation as in dynamic Bayesian networks (DBNs)—and parameters are tied across slices. Since exact inference can be intractable in such models, we perform approximate inference using several schedules for belief propagation, including tree-based reparameterization (TRP). On a natural-language chunking task, we show that a DCRF performs better than a series of linear-chain CRFs, achieving comparable performance using only half the training data.
John Lafferty, Xiaojin Zhu and Yan Liu. Kernel conditional random fields: representation and clique selection. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Kernel conditional random fields (KCRFs) are introduced as a framework for discriminative modeling of graph-structured data. A representer theorem for conditional graphical models is given which shows how kernel conditional random fields arise from risk minimization procedures defined using Mercer kernels on labeled graphs. A procedure for greedily selecting cliques in the dual representation is then proposed, which allows sparse representations. By incorporating kernels and implicit feature spaces into conditional graphical models, the framework enables semi-supervised learning algorithms for structured data through the use of graph kernels. The framework and clique selection methods are demonstrated in synthetic data experiments, and are also applied to the problem of protein secondary structure prediction.
Xuming He, Richard Zemel, and Miguel Á. Carreira-Perpiñán. Multiscale conditional random fields for image labelling. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), 2004.

We propose an approach to include contextual features for labeling images, in which each pixel is assigned to one of a finite set of labels. The features are incorporated into a probabilistic framework which combines the outputs of several components. Components differ in the information they encode. Some focus on the image-label mapping, while others focus solely on patterns within the label field. Components also differ in their scale, as some focus on fine-resolution patterns while others on coarser, more global structure. A supervised version of the contrastive divergence algorithm is applied to learn these features from labeled image data. We demonstrate performance on two real-world image databases and compare it to a classifier and a Markov random field.
Yasemin Altun, Alex J. Smola, Thomas Hofmann. Exponential Families for Conditional Random Fields. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI-2004), 2004.

In this paper we define conditional random fields in reproducing kernel Hilbert spaces and show connections to Gaussian Process classification. More specifically, we prove decomposition results for undirected graphical models and we give constructions for kernels. Finally we present efficient means of solving the optimization problem using reduced rank decompositions and we show how stationarity can be exploited efficiently in the optimization process.
Michelle L. Gregory and Yasemin Altun. Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), 2004.

The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural sounding speech, while automatic detection of accents can contribute to better word-level recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influence pitch accent placement in natural, conversational speech in a sequence labeling setting. We introduce Conditional Random Fields (CRFs) to the pitch accent prediction task in order to incorporate these factors efficiently in a sequence model. We demonstrate the usefulness and the incremental effect of these factors in a sequence model by performing experiments on hand labeled data from the Switchboard Corpus. Our model outperforms the baseline and previous models of pitch accent prediction on the Switchboard Corpus.
Brian Roark, Murat Saraclar, Michael Collins and Mark Johnson. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), 2004.

This paper describes discriminative language modeling for a large vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm, and a method based on conditional random fields (CRFs). The models are encoded as deterministic weighted finite state automata, and are applied by intersecting the automata with word-lattices that are the output from a baseline recognizer. The perceptron algorithm has the benefit of automatically selecting a relatively small feature set in just a couple of passes over the training data. However, using the feature set output from the perceptron algorithm (initialized with their weights), CRF training provides an additional 0.5% reduction in word error rate, for a total 1.8% absolute reduction from the baseline of 39.2%.
Ryan McDonald and Fernando Pereira. Identifying Gene and Protein Mentions in Text Using Conditional Random Fields. BioCreative, 2004.

Trausti T. Kristjansson, Aron Culotta, Paul Viola and Andrew McCallum. Interactive Information Extraction with Constrained Conditional Random Fields. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI 2004), 2004.

Information Extraction methods can be used to automatically "fill-in" database forms from unstructured data such as Web documents or email. State-of-the-art methods have achieved low error rates but invariably make a number of errors. The goal of an interactive information extraction system is to assist the user in filling in database fields while giving the user confidence in the integrity of the data. The user is presented with an interactive interface that allows both the rapid verification of automatic field assignments and the correction of errors. In cases where there are multiple errors, our system takes into account user corrections, and immediately propagates these constraints such that other fields are often corrected automatically. Linear-chain conditional random fields (CRFs) have been shown to perform well for information extraction and other language modelling tasks due to their ability to capture arbitrary, overlapping features of the input in a Markov model. We apply this framework with two extensions: a constrained Viterbi decoding which finds the optimal field assignments consistent with the fields explicitly specified or corrected by the user; and a mechanism for estimating the confidence of each extracted field, so that low-confidence extractions can be highlighted. Both of these mechanisms are incorporated in a novel user interface for form filling that is intuitive and speeds the entry of data—providing a 23% reduction in error due to automated corrections.
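
The constrained decoding idea is easy to sketch: pin the user-corrected positions to their labels and run Viterbi over the remaining paths. The code below is a minimal illustration in that spirit (toy random score matrices and the function name are my assumptions, not the paper's model or code):

```python
import numpy as np

def constrained_viterbi(emit, trans, constraints):
    """emit: (T, K) log-scores; trans: (K, K) log transition scores;
    constraints: {position: required_label} from user corrections."""
    T, K = emit.shape

    def mask(t):
        # -inf score for labels ruled out by a constraint at position t.
        m = np.zeros(K)
        if t in constraints:
            m[:] = -np.inf
            m[constraints[t]] = 0.0
        return m

    delta = emit[0] + mask(0)            # best score ending in each label
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        scores = delta[:, None] + trans + emit[t][None, :] + mask(t)[None, :]
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0)

    path = [int(delta.argmax())]         # recover the best consistent path
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
emit = np.log(rng.dirichlet(np.ones(3), size=5))   # 5 positions, 3 labels
trans = np.log(rng.dirichlet(np.ones(3), size=3))
print(constrained_viterbi(emit, trans, {2: 1}))    # pin position 2 to label 1
```
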
Thomas G. Dietterich, Adam Ashenfelter and Yaroslav Bulatov. Training Conditional Random Fields via Gradient Tree Boosting. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Conditional Random Fields (CRFs; Lafferty, McCallum, & Pereira, 2001) provide a flexible and powerful model for learning to assign labels to elements of sequences in such applications as part-of-speech tagging, text-to-speech mapping, protein and DNA sequence analysis, and information extraction from web pages. However, existing learning algorithms are slow, particularly in problems with large numbers of potential input features. This paper describes a new method for training CRFs by applying Friedman's (1999) gradient tree boosting method. In tree boosting, the CRF potential functions are represented as weighted sums of regression trees. Regression trees are learned by stage-wise optimizations similar to Adaboost, but with the objective of maximizing the conditional likelihood P(Y|X) of the CRF model. By growing regression trees, interactions among features are introduced only as needed, so although the parameter space is potentially immense, the search algorithm does not explicitly consider the large space. As a result, gradient tree boosting scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially like previous algorithms based on iterative scaling and gradient descent.
John Lafferty, Yan Liu and Xiaojin Zhu. Kernel Conditional Random Fields: Representation, Clique Selection, and Semi-Supervised Learning. Technical Report CMU-CS-04-115, Carnegie Mellon University, 2004.

Kernel conditional random fields are introduced as a framework for discriminative modeling of graph-structured data. A representer theorem for conditional graphical models is given which shows how kernel conditional random fields arise from risk minimization procedures defined using Mercer kernels on labeled graphs. A procedure for greedily selecting cliques in the dual representation is then proposed, which allows sparse representations. By incorporating kernels and implicit feature spaces into conditional graphical models, the framework enables semi-supervised learning algorithms for structured data through the use of graph kernels. The clique selection and semi-supervised methods are demonstrated in synthetic data experiments, and are also applied to the problem of protein secondary structure prediction.
Fuchun Peng and Andrew McCallum (2004). Accurate Information Extraction from Research Papers using Conditional Random Fields. In Proceedings of Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT/NAACL-04), 2004.

With the increasing use of research paper search engines, such as CiteSeer, for both literature search and hiring decisions, the accuracy of such systems is of paramount importance. This paper employs Conditional Random Fields (CRFs) for the task of extracting various common fields from the headers and citation of research papers. The basic theory of CRFs is becoming well-understood, but best-practices for applying them to real-world data requires additional exploration. This paper makes an empirical exploration of several factors, including variations on Gaussian, exponential and hyperbolic priors for improved regularization, and several classes of features and Markov order. On a standard benchmark data set, we achieve new state-of-the-art performance, reducing error in average F1 by 36%, and word error rate by 78% in comparison with the previous best SVM results. Accuracy compares even more favorably against HMMs.
Yasemin Altun, Thomas Hofmann and Alexander J. Smola. Gaussian process classification for segmenting and annotating sequences. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Many real-world classification tasks involve the prediction of multiple, inter-dependent class labels. A prototypical case of this sort deals with prediction of a sequence of labels for a sequence of observations. Such problems arise naturally in the context of annotating and segmenting observation sequences. This paper generalizes Gaussian Process classification to predict multiple labels by taking dependencies between neighboring labels into account. Our approach is motivated by the desire to retain rigorous probabilistic semantics, while overcoming limitations of parametric methods like Conditional Random Fields, which exhibit conceptual and computational difficulties in high-dimensional input spaces. Experiments on named entity recognition and pitch accent prediction tasks demonstrate the competitiveness of our approach.
Yasemin Altun and Thomas Hofmann. Gaussian Process Classification for Segmenting and Annotating Sequences. Technical Report CS-04-12, Department of Computer Science, Brown University, 2004.

Multiclass classification refers to the problem of assigning labels to instances where labels belong to some finite set of elements. Often, however, the instances to be labeled do not occur in isolation, but rather in observation sequences. One is then interested in predicting the joint label configuration, i.e. the sequence of labels, using models that take possible interdependencies between label variables into account. This scenario subsumes problems of sequence segmentation and annotation. In this paper, we investigate the use of Gaussian Process (GP) classification for label sequences.

2005

Cristian Sminchisescu, Atul Kanaujia, Zhiguo Li and Dimitris Metaxas. Conditional Models for Contextual Human Motion Recognition. In Proceedings of the International Conference on Computer Vision (ICCV 2005), Beijing, China, 2005.

We present algorithms for recognizing human motion in monocular video sequences, based on discriminative Conditional Random Field (CRF) and Maximum Entropy Markov Models (MEMM). Existing approaches to this problem typically use generative (joint) structures like the Hidden Markov Model (HMM). Therefore they have to make simplifying, often unrealistic assumptions on the conditional independence of observations given the motion class labels and cannot accommodate overlapping features or long term contextual dependencies in the observation sequence. In contrast, conditional models like the CRFs seamlessly represent contextual dependencies, support efficient, exact inference using dynamic programming, and their parameters can be trained using convex optimization. We introduce conditional graphical models as complementary tools for human motion recognition and present an extensive set of experiments that show how these typically outperform HMMs in classifying not only diverse human activities like walking, jumping, running, picking or dancing, but also for discriminating among subtle motion styles like normal walk and wander walk.
Ariadna Quattoni, Michael Collins and Trevor Darrell. Conditional Random Fields for Object Recognition. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

We present a discriminative part-based approach for the recognition of object classes from unsegmented cluttered scenes. Objects are modeled as flexible constellations of parts conditioned on local observations found by an interest operator. For each object class the probability of a given assignment of parts to local features is modeled by a Conditional Random Field (CRF). We propose an extension of the CRF framework that incorporates hidden variables and combines class conditional CRFs into a unified framework for part-based object recognition. The parameters of the CRF are estimated in a maximum likelihood framework and recognition proceeds by finding the most likely class under our model. The main advantage of the proposed CRF framework is that it allows us to relax the assumption of conditional independence of the observed data (i.e. local features) often used in generative approaches, an assumption that might be too restrictive for a considerable number of object classes. We illustrate the potential of the model in the task of recognizing cars from rear and side views.
Joseph Bockhorst and Mark Craven. Markov Networks for Detecting Overlapping Elements in Sequence Data. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

Many sequential prediction tasks involve locating instances of patterns in sequences. Generative probabilistic language models, such as hidden Markov models (HMMs), have been successfully applied to many of these tasks. A limitation of these models, however, is that they cannot naturally handle cases in which pattern instances overlap in arbitrary ways. We present an alternative approach, based on conditional Markov networks, that can naturally represent arbitrarily overlapping elements. We show how to efficiently train and perform inference with these models. Experimental results from a genomics domain show that our models are more accurate at locating instances of overlapping patterns than are baseline models based on HMMs.
Antonio Torralba, Kevin P. Murphy, William T. Freeman. Contextual models for object detection using boosted random fields. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes.
Sunita Sarawagi and William W. Cohen. Semi-Markov Conditional Random Fields for Information Extraction. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence x outputs a "segmentation" of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements x_i of x. Importantly, features for semi-CRFs can measure properties of segments, and transitions within a segment can be non-Markovian. In spite of this additional power, exact learning and inference algorithms for semi-CRFs are polynomial-time—often only a small constant factor slower than conventional CRFs. In experiments on five named entity recognition problems, semi-CRFs generally outperform conventional CRFs.
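
In the spirit of the abstract, a segmentation s of x into labeled segments, where segment j spans positions t_j through u_j with label y_j, is scored by segment-level features g (a sketch of the standard semi-CRF form; the symbols are my notation, not quoted from the paper):

```latex
% Semi-CRF: segment features g may inspect an entire segment of x,
% not just one position, so non-Markovian behaviour within a segment
% is allowed.
p(\mathbf{s} \mid \mathbf{x}) \;=\;
  \frac{1}{Z(\mathbf{x})}
  \exp\!\Big( \sum_{j} \mathbf{w} \cdot \mathbf{g}(y_j, y_{j-1}, \mathbf{x}, t_j, u_j) \Big)
```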
Yuan Qi, Martin Szummer and Thomas P. Minka. Bayesian Conditional Random Fields. To appear in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), 2005.

We propose Bayesian Conditional Random Fields (BCRFs) for classifying interdependent and structured data, such as sequences, images or webs. BCRFs are a Bayesian approach to training and inference with conditional random fields, which were previously trained by maximizing likelihood (ML) (Lafferty et al., 2001). Our framework eliminates the problem of overfitting, and offers the full advantages of a Bayesian treatment. Unlike the ML approach, we estimate the posterior distribution of the model parameters during training, and average over this posterior during inference. We apply an extension of the EP method, the power EP method, to incorporate the partition function. For algorithmic stability and accuracy, we flatten the approximation structures to avoid two-level approximations. We demonstrate the superior prediction accuracy of BCRFs over conditional random fields trained with ML or MAP on synthetic and real datasets.
Aron Culotta, David Kulp and Andrew McCallum. Gene Prediction with Conditional Random Fields. Technical Report UM-CS-2005-028. University of Massachusetts, Amherst, 2005.

Given a sequence of DNA nucleotide bases, the task of gene prediction is to find subsequences of bases that encode proteins. Reasonable performance on this task has been achieved using generatively trained sequence models, such as hidden Markov models. We propose instead the use of a discriminatively trained sequence model, the conditional random field (CRF). CRFs can naturally incorporate arbitrary, non-independent features of the input without making conditional independence assumptions among the features. This can be particularly important for gene finding, where including evidence from protein databases, EST data, or tiling arrays may improve accuracy. We evaluate our model on human genomic data, and show that CRFs perform better than HMM-based models at incorporating homology evidence from protein databases, achieving a 10% reduction in base-level errors.
Yang Wang and Qiang Ji. A Dynamic Conditional Random Field Model for Object Segmentation in Image Sequences. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), Volume 1, 2005.

This paper presents a dynamic conditional random field (DCRF) model to integrate contextual constraints for object segmentation in image sequences. Spatial and temporal dependencies within the segmentation process are unified by a dynamic probabilistic framework based on the conditional random field (CRF). An efficient approximate filtering algorithm is derived for the DCRF model to recursively estimate the segmentation field from the history of video frames. The segmentation method employs both intensity and motion cues, and it combines dynamic information and spatial interaction of the observed data. Experimental results show that the proposed approach effectively fuses contextual constraints in video sequences and improves the accuracy of object segmentation.

software

MALLET: A Machine Learning for Language Toolkit.

MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text.
ABNER: A Biomedical Named Entity Recognizer.

ABNER is a text analysis tool for molecular biology. It is essentially an interactive, user-friendly interface to a system designed as part of the NLPBA/BioNLP 2004 Shared Task challenge.
MinorThird.

MinorThird is a collection of Java classes for storing text, annotating text, and learning to extract entities and categorize text.
Kevin Murphy's MATLAB CRF code.

Conditional random fields (chains, trees and general graphs; includes BP code).
Sunita Sarawagi's CRF package.

The CRF package is a Java implementation of conditional random
fields for sequential labeling.