[Hive - LanguageManual] DML: Load, Insert, Update, Delete

2015-01-25 12:41

LanguageManual DML

Hive Data Manipulation Language

Loading files into tables
    Syntax
    Synopsis
    Notes
Inserting data into Hive Tables from queries
    Syntax
    Synopsis
    Notes
    Dynamic Partition Inserts
        Example
        Additional Documentation
Writing data into the filesystem from queries
    Syntax
    Synopsis
    Notes
Inserting values into tables from SQL
    Syntax
    Synopsis
    Examples
Update
    Syntax
    Synopsis
    Notes
Delete
    Syntax
    Synopsis
    Notes

There are multiple ways to modify data in Hive:

LOAD

INSERT
    into Hive tables from queries
    into directories from queries
    into Hive tables from SQL

UPDATE

DELETE

EXPORT and IMPORT commands are also available (as of Hive 0.8).

Loading files into tables

Hive does not do any transformation while loading data into tables. Load operations are currently pure copy/move operations that move data files into locations corresponding to Hive tables.

Syntax
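The synopsis that follows refers to filepath and to the LOCAL and OVERWRITE keywords. For orientation, the general form of the statement is sketched here as it appears in the Hive language manual. Bracketed parts are optional; tablename, partcol1, val1 and so on are placeholders.

```sql
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
  [PARTITION (partcol1=val1, partcol2=val2 ...)]
```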

Synopsis

Load operations are currently pure copy/move operations that move data files into locations corresponding to Hive tables.

filepath can be:

    a relative path, such as project/data1
    an absolute path, such as /user/hive/project/data1
    a full URI with scheme and (optionally) an authority, such as hdfs://namenode:9000/user/hive/project/data1

The target being loaded to can be a table or a partition. If the table is partitioned, then one must specify a specific partition of the table by specifying values for all of the partitioning columns.

filepath can refer to a file (in which case Hive will move the file into the table) or it can be a directory (in which case Hive will move all the files within that directory into the table). In either case, filepath addresses a set of files.

If the keyword LOCAL is specified, then:

    the load command will look for filepath in the local file system. If a relative path is specified, it will be interpreted relative to the user's current working directory. The user can specify a full URI for local files as well, for example: file:///user/hive/project/data1
    the load command will try to copy all the files addressed by filepath to the target filesystem. The target file system is inferred by looking at the location attribute of the table. The copied data files will then be moved to the table.

If the keyword LOCAL is not specified, then Hive will either use the full URI of filepath, if one is specified, or will apply the following rules:

    If scheme or authority are not specified, Hive will use the scheme and authority from the hadoop configuration variable fs.default.name that specifies the Namenode URI.
    If the path is not absolute, then Hive will interpret it relative to /user/<username>
    Hive will move the files addressed by filepath into the table (or partition)

If the OVERWRITE keyword is used then the contents of the target table (or partition) will be deleted and replaced by the files referred to by filepath; otherwise the files referred to by filepath will be added to the table.

Note that if the target table (or partition) already has a file whose name collides with any of the filenames contained in filepath, then the existing file will be replaced with the new file.
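The rules above can be illustrated with a short sketch. The table page_view_raw, its dt partition column, and both file paths are hypothetical names chosen for illustration; only the statement shape comes from the manual.

```sql
-- Hypothetical table partitioned by a date string; names are illustrative only.
CREATE TABLE IF NOT EXISTS page_view_raw (line STRING)
PARTITIONED BY (dt STRING);

-- LOCAL: the file is looked up on the local file system and copied to the
-- table's file system. OVERWRITE: the partition's current contents are replaced.
LOAD DATA LOCAL INPATH '/tmp/pv_2008-06-08.txt'
OVERWRITE INTO TABLE page_view_raw PARTITION (dt='2008-06-08');

-- No LOCAL: the path is resolved against the default file system and the
-- files in the directory are moved (not copied) into the partition.
-- No OVERWRITE: the files are added to whatever the partition already holds.
LOAD DATA INPATH '/user/hive/project/data1'
INTO TABLE page_view_raw PARTITION (dt='2008-06-09');
```

Because the table is partitioned, each statement names a value for every partition column, as the synopsis requires.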

Notes

filepath cannot contain subdirectories. (filepath may be a directory or a file, but a directory must not contain subdirectories.)

If the keyword LOCAL is not given, filepath must refer to files within the same filesystem as the table's (or partition's) location.

Hive does some minimal checks to make sure that the files being loaded match the target table. Currently it checks that if the table is stored in sequencefile format, the files being loaded are also sequencefiles, and vice versa.

A bug that prevented loading a file when its name includes the "+" character is fixed in release 0.13.0 (HIVE-6048).

Please read CompressedStorage if your data file is compressed.

Inserting data into Hive Tables from queries

Query results can be inserted into tables by using the insert clause.

Syntax
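For reference, the statement forms that the synopsis below discusses are sketched here as they appear in the Hive language manual. Bracketed parts are optional; tablename1, from_statement, select_statement1 and the partition names are placeholders.

```sql
-- Standard syntax:
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]]
  select_statement1 FROM from_statement;
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)]
  select_statement1 FROM from_statement;

-- Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1
[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2]
[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;

-- Hive extension (dynamic partition inserts):
INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...)
  select_statement FROM from_statement;
INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...)
  select_statement FROM from_statement;
```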

Synopsis

INSERT OVERWRITE will overwrite any existing data in the table or partition, unless IF NOT EXISTS is provided for a partition (as of Hive 0.9.0).

INSERT INTO will append to the table or partition, keeping the existing data intact. (Note: INSERT INTO syntax is only available starting in version 0.8.)

As of Hive 0.13.0, a table can be made immutable by creating it with TBLPROPERTIES ("immutable"="true"). The default is "immutable"="false". INSERT INTO an immutable table is disallowed if any data is already present, although INSERT INTO still works if the immutable table is empty. The behavior of INSERT OVERWRITE is not affected by the "immutable" table property. An immutable table is protected against accidental updates caused by a script that loads data into it being run multiple times by mistake (that is, it guards against repeated inserts and modifications). The first insert into an immutable table succeeds and successive inserts fail, resulting in only one set of data in the table, instead of silently succeeding with multiple copies of the data in the table.

Inserts can be done to a table or a partition. If the table is partitioned, then one must specify a specific partition of the table by specifying values for all of the partitioning columns.

Multiple insert clauses (also known as Multi Table Insert) can be specified in the same query.

The output of each of the select statements is written to the chosen table (or partition). Currently the OVERWRITE keyword is mandatory and implies that the contents of the chosen table or partition are replaced with the output of the corresponding select statement.

The output format and serialization class are determined by the table's metadata (as specified via DDL commands on the table).

As of Hive 0.14, if a table has an OutputFormat that implements AcidOutputFormat and the system is configured to use a transaction manager that implements ACID, then INSERT OVERWRITE will be disabled for that table. This is to avoid users unintentionally overwriting transaction history. The same functionality can be achieved by using TRUNCATE TABLE (for non-partitioned tables) or DROP PARTITION followed by INSERT INTO.
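The "immutable" table property described in the synopsis can be sketched as follows. The tables daily_snapshot and staging_events and their columns are hypothetical, and the sketch assumes Hive 0.13.0 or later and a non-empty staging_events.

```sql
-- Hypothetical table created as immutable.
CREATE TABLE daily_snapshot (id INT, payload STRING)
TBLPROPERTIES ("immutable"="true");

-- Allowed: the immutable table is still empty.
INSERT INTO TABLE daily_snapshot
SELECT id, payload FROM staging_events;

-- Expected to be rejected: the table now holds data, so a second
-- INSERT INTO (for example, the same script run twice) fails.
INSERT INTO TABLE daily_snapshot
SELECT id, payload FROM staging_events;

-- Still allowed: INSERT OVERWRITE is not affected by the property.
INSERT OVERWRITE TABLE daily_snapshot
SELECT id, payload FROM staging_events;
```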

Notes

Multi Table Inserts minimize the number of data scans required. Hive can insert data into multiple tables by scanning the input data just once (and applying different query operators to the input data).

Starting with Hive 0.13.0, the select statement can include one or more common table expressions (CTEs) as shown in the SELECT syntax. For an example, see Common Table Expression.
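A multi table insert of the kind mentioned in the notes above might look like the following sketch. The source table page_view_stg, the target tables page_view_us and page_view_ca, and all column names are assumptions made for illustration.

```sql
-- The staging table is scanned once; each INSERT clause applies its own
-- filter and writes to its own target partition.
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view_us PARTITION (dt='2008-06-08')
  SELECT pvs.viewTime, pvs.userid, pvs.page_url
  WHERE pvs.country = 'US'
INSERT OVERWRITE TABLE page_view_ca PARTITION (dt='2008-06-08')
  SELECT pvs.viewTime, pvs.userid, pvs.page_url
  WHERE pvs.country = 'CA';
```

Writing the two targets as separate INSERT ... SELECT statements would read page_view_stg twice; the single FROM clause is what lets Hive share the scan.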

Dynamic Partition Inserts

Version information: This information reflects the situation in Hive 0.12; dynamic partition inserts were added in Hive 0.6.

In the dynamic partition inserts, users can give partial partition specifications, which means just specifying the list of partition column names in the PARTITION clause. The column values are optional. If a partition column value is given, we call this a static partition, otherwise it is a dynamic partition. Each dynamic partition column has a corresponding input column from the select statement. This means that the dynamic partition creation is determined by the value of the input column. The dynamic partition columns must be specified last among the columns in the SELECT statement and in the same order in which they appear in the PARTITION() clause.

Dynamic partition inserts are disabled by default. These are the relevant configuration properties for dynamic partition inserts:

Configuration property | Default | Note
hive.exec.dynamic.partition | false | Needs to be set to true to enable dynamic partition inserts
hive.exec.dynamic.partition.mode | strict | In strict mode, the user must specify at least one static partition, as a guard against accidentally overwriting all partitions; in nonstrict mode all partitions are allowed to be dynamic
hive.exec.max.dynamic.partitions.pernode | 100 | Maximum number of dynamic partitions allowed to be created in each mapper/reducer node
hive.exec.max.dynamic.partitions | 1000 | Maximum number of dynamic partitions allowed to be created in total
hive.exec.max.created.files | 100000 | Maximum number of HDFS files created by all mappers/reducers in a MapReduce job
hive.error.on.empty.partition | false | Whether to throw an exception if dynamic partition insert generates empty results
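Since dynamic partition inserts are disabled by default, a session typically sets the properties listed above before running such an insert. A minimal sketch follows; the two raised limits are arbitrary illustrative values, not recommendations.

```sql
-- Enable dynamic partition inserts for this session.
SET hive.exec.dynamic.partition=true;

-- Optional: allow every partition column to be dynamic.
-- Leave this at strict to keep the guard against overwriting all partitions.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Optional: raise the limits if a job legitimately creates many partitions
-- (illustrative values only).
SET hive.exec.max.dynamic.partitions.pernode=200;
SET hive.exec.max.dynamic.partitions=2000;
```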

Example
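The description in this section refers to a query with a static dt partition, a dynamic country partition, and pvs.cnt as the last selected column. A query of that shape is sketched below; the table names page_view_stg and page_view and the columns other than pvs.cnt are assumptions.

```sql
FROM page_view_stg pvs
INSERT OVERWRITE TABLE page_view PARTITION (dt='2008-06-08', country)
  SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.referrer_url,
         null, null, pvs.ip,
         pvs.cnt;  -- last column: supplies the value of the dynamic partition "country"
```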

Here the country partition will be dynamically created by the last column from the SELECT clause (i.e. pvs.cnt). Note that the name is not used. In nonstrict mode the dt partition could also be dynamically created.

Additional Documentation

Design Document
    Original design doc
    HIVE-936

Tutorial: Dynamic-Partition Insert

HCatalog Dynamic Partitioning
    Usage with Pig
    Usage from MapReduce

Writing data into the filesystem from queries

Query results can be inserted into filesystem directories by using a slight variation of the syntax above:

Syntax
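For reference, the directory form of the statement is sketched here as it appears in the Hive language manual. Bracketed parts are optional; directory1, from_statement and select_statement1 are placeholders.

```sql
-- Standard syntax:
INSERT OVERWRITE [LOCAL] DIRECTORY directory1
  [ROW FORMAT row_format] [STORED AS file_format]  -- these two clauses: Hive 0.11.0 and later
  SELECT ... FROM ...;

-- Hive extension (multiple inserts):
FROM from_statement
INSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1
[INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...;
```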

Synopsis

Directory can be a full URI. If scheme or authority are not specified, Hive will use the scheme and authority from the hadoop configuration variable fs.default.name that specifies the Namenode URI.

If the LOCAL keyword is used, Hive will write data to the directory on the local file system.

Data written to the filesystem is serialized as text with columns separated by ^A and rows separated by newlines. If any of the columns are not of primitive type, then those columns are serialized to JSON format.

Notes

INSERT OVERWRITE statements to directories, local directories, and tables (or partitions) can all be used together within the same query.

INSERT OVERWRITE statements to HDFS filesystem directories are the best way to extract large amounts of data from Hive. Hive can write to HDFS directories in parallel from within a map-reduce job.

The directory is, as you would expect, overwritten; in other words, if the specified path exists, it is clobbered and replaced with the output.

As of Hive 0.11.0 the separator used can be specified; in earlier versions it was always the ^A character (\001).

In Hive 0.14, inserts into ACID compliant tables will deactivate vectorization for the duration of the select and insert. This will be done automatically. ACID tables that have data inserted into them can still be queried using vectorization.
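The notes above can be made concrete with a short sketch. The table page_view, its columns, and both target directories are hypothetical; the second statement assumes Hive 0.11.0 or later because it names its own separator.

```sql
-- Export to an HDFS directory with the default ^A (\001) column separator.
-- Anything already at this path is replaced.
INSERT OVERWRITE DIRECTORY '/user/hive/export/page_view_us'
SELECT pv.userid, pv.page_url
FROM page_view pv
WHERE pv.country = 'US';

-- Export to a directory on the local file system, tab-separated
-- (custom separators require Hive 0.11.0 or later).
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/page_view_us'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT pv.userid, pv.page_url
FROM page_view pv
WHERE pv.country = 'US';
```

Because the target path is clobbered, it is safer to point these statements at a dedicated export directory than at one shared with other data.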

Inserting values into tables from SQL

The INSERT...VALUES statement can be used to insert data into tables directly from SQL.

Version Information: INSERT...VALUES is available starting in Hive 0.14.

Inserting values from SQL statements can only be performed on tables that support ACID. See Hive Transactions for details.

Syntax
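For reference, the statement form is sketched here as it appears in the Hive language manual. Bracketed parts are optional; tablename and the partition names are placeholders.

```sql
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)]
  VALUES values_row [, values_row ...];

-- where a values_row is:
--   ( value [, value ...] )
-- and each value is either null or any valid SQL literal
```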

Synopsis

Each row listed in the VALUES clause is inserted into table tablename.

Values must be provided for every column in the table. The standard SQL syntax that allows the user to insert values into only some columns is not yet supported. To mimic the standard SQL, nulls can be provided for columns the user does not wish to assign a value to.

Dynamic partitioning is supported in the same way as for INSERT...SELECT.

If the table being inserted into supports ACID and a transaction manager that supports ACID is in use, this operation will be auto-committed upon successful completion.

Insert, update, delete operations are not supported on tables that are sorted (tables created with the SORTED BY clause).

Hive does not support literals for complex types, so it is not possible to use them in INSERT INTO...VALUES clauses. This means that the user cannot insert data into columns of a complex data type (array, map, struct, union) using the INSERT INTO...VALUES clause.

Examples
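The following sketch shows the three cases covered by the synopsis: an unpartitioned table, a static partition, and a dynamic partition. The tables students and pageviews are illustrative. The bucketed ORC layout and the 'transactional' table property are assumptions about what an ACID table requires in Hive 0.14; see Hive Transactions for the authoritative requirements.

```sql
-- Illustrative ACID table (assumed requirements: bucketed, ORC, transactional).
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2))
  CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

-- A value is supplied for every column of the table.
INSERT INTO TABLE students
  VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);

CREATE TABLE pageviews (userid VARCHAR(64), link STRING, came_from STRING)
  PARTITIONED BY (datestamp STRING)
  CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

-- Static partition: the partition value is given in the PARTITION clause.
-- null stands in for a column the user does not want to assign.
INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')
  VALUES ('jsmith', 'mail.com', 'sports.com'), ('jdoe', 'mail.com', null);

-- Dynamic partition: the partition value is the last value of each row.
-- Requires dynamic partition inserts to be enabled, and nonstrict mode
-- because no static partition is given.
INSERT INTO TABLE pageviews PARTITION (datestamp)
  VALUES ('tjohnson', 'sports.com', 'finance.com', '2014-09-23'),
         ('tlee', 'finance.com', null, '2014-09-21');
```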

Update

Version Information: UPDATE is available starting in Hive 0.14.

Updates can only be performed on tables that support ACID. See Hive Transactions for details.

Syntax
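The statement form from the Hive language manual is given in the leading comment below, followed by a minimal usage sketch. The table students (name, age, gpa), bucketed on age and ACID-enabled, is hypothetical.

```sql
-- General form:
--   UPDATE tablename SET column = value [, column = value ...] [WHERE expression]

-- Recommended in Hive 0.14 for more efficient update plans.
SET hive.optimize.sort.dynamic.partition=false;

-- gpa is neither a partitioning nor a bucketing column, so it may be updated.
-- Only rows matching the WHERE clause are changed.
UPDATE students
SET gpa = 3.50
WHERE name = 'fred flintstone';
```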

Synopsis

The referenced column must be a column of the table being updated.

The value assigned must be an expression that Hive supports in the select clause. Thus arithmetic operators, UDFs, casts, literals, etc. are supported. Subqueries are not supported.

Only rows that match the WHERE clause will be updated.

Partitioning columns cannot be updated.

Bucketing columns cannot be updated.

In Hive 0.14, upon successful completion of this operation the changes will be auto-committed.

Notes

Vectorization will be turned off for update operations. This is automatic and requires no action on the part of the user. Non-update operations are not affected. Updated tables can still be queried using vectorization.

In version 0.14 it is recommended that you set hive.optimize.sort.dynamic.partition=false when doing updates, as this produces more efficient execution plans.

Delete

Version Information: DELETE is available starting in Hive 0.14.

Deletes can only be performed on tables that support ACID. See Hive Transactions for details.

Syntax
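The statement form from the Hive language manual is given in the leading comment below, followed by a minimal usage sketch. The ACID-enabled table students and its age column are hypothetical.

```sql
-- General form:
--   DELETE FROM tablename [WHERE expression]

-- Recommended in Hive 0.14 for more efficient delete plans.
SET hive.optimize.sort.dynamic.partition=false;

-- Only rows matching the WHERE clause are removed.
DELETE FROM students
WHERE age < 18;
```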

Synopsis

Only rows that match the WHERE clause will be deleted.

In Hive 0.14, upon successful completion of this operation the changes will be auto-committed.

Notes

Vectorization will be turned off for delete operations. This is automatic and requires no action on the part of the user. Non-delete operations are not affected. Tables with deleted data can still be queried using vectorization.

In version 0.14 it is recommended that you set hive.optimize.sort.dynamic.partition=false when doing deletes, as this produces more efficient execution plans.