
Maximum number of threads allowed on the Windows operating system

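2017-07-09 23:47, 183 views
By default, each thread reserves 1 MB of address space for its stack.
A 32-bit process has only 2 GB of usable (user-mode) address space, so in theory a process can create at most 2,048 threads.
Of course, that space cannot all be used for thread stacks, so the real number is lower than this.
You can also change the default stack size at link time. Making it smaller lets you create more threads.
For example, with the default stack size set to 512 KB, the theoretical maximum becomes 4,096 threads.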

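No matter how much physical memory is installed, the number of threads a single process can start is always bounded by this 2 GB address space.
For example, even on a machine with 64 GB of physical memory, each 32-bit process still has a 4 GB address space, of which only 2 GB is available in user mode.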

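Across a whole machine, the number of threads is also limited by memory. Every thread object consumes nonpaged pool, and nonpaged pool is finite. Once it is exhausted, no more threads can be created.

The more physical memory a machine has, the higher the machine-wide limit on the number of threads becomes.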

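Why does a Windows program exit abnormally once a single process has spawned about 2,000 threads?

The problem arises because, on 32-bit Windows, a process can use at most 2 GB of virtual memory, while the default stack size of a thread is 1,024 KB (1 MB). As the thread count approaches 2,000, 2,000 × 1,024 KB ≈ 2 GB, so the address space is effectively exhausted.

From MSDN: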

“The number of threads a process can create is limited by the available virtual memory. By default, every thread has one megabyte of stack space. Therefore, you can create at most 2,048 threads. If you reduce the default stack size, you can create more threads. However, your application will have better performance if you create one thread per processor and build queues of requests for which the application maintains the context information. A thread would process all requests in a queue before processing requests in the next queue.”

How can you get past the 2,000-thread limit?

You can shrink the thread stack size through the stack-size parameter of CreateThread, for example:

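#include <stdio.h>
#include <windows.h>

#define MAX_THREADS 50000

/* Each thread just sleeps forever so that it stays alive. */
DWORD WINAPI ThreadProc(LPVOID lpParam) {
    (void)lpParam;
    while (1) {
        Sleep(100000);
    }
    return 0;
}

int main() {
    /* static: keeps these large arrays off the main thread's own stack */
    static DWORD dwThreadId[MAX_THREADS];
    static HANDLE hThread[MAX_THREADS];

    for (int i = 0; i < MAX_THREADS; ++i)
    {
        /* Request a tiny stack reservation instead of the default 1 MB;
           the system rounds very small values up. */
        hThread[i] = CreateThread(0, 64, ThreadProc, 0, STACK_SIZE_PARAM_IS_A_RESERVATION, &dwThreadId[i]);

        if (0 == hThread[i])
        {
            /* Print the error code of the first failed creation and stop. */
            DWORD e = GetLastError();
            printf("%lu\r\n", e);
            break;
        }
    }
    ThreadProc(0);  /* park the main thread as well */
}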


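Server-side program design

If your server is designed to create one thread for every incoming client connection, it will run into the 2,000-thread limit (for a given amount of memory and number of CPUs). The advice is as follows: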

The "one thread per client" model is well-known not to scale beyond a dozen clients or so. If you're going to be handling more than that many clients simultaneously, you should move to a model where instead of dedicating a thread to a client, you instead allocate an object. (Someday I'll muse on the duality between threads and objects.) Windows provides I/O completion ports and a thread pool to help you convert from a thread-based model to a work-item-based model.

1. Serve many clients with each thread, and use nonblocking I/O and level-triggered readiness notification
2. Serve many clients with each thread, and use nonblocking I/O and readiness change notification
3. Serve many clients with each server thread, and use asynchronous I/O

Pushing the Limits of Windows: Processes and Threads

Mark Russinovich, July 5, 2009


This is the fourth post in my Pushing the Limits of Windows series that explores the boundaries of fundamental resources in Windows. This time, I’m going to discuss the limits on the maximum number of threads and processes supported on Windows. I’ll briefly describe the difference between a thread and a process, survey thread limits and then investigate process limits. I cover thread limits first since every active process has at least one thread (a process that’s terminated, but is kept referenced by a handle owned by another process won’t have any), so the limit on processes is directly affected by the caps that limit threads.

Unlike some UNIX variants, most resources in Windows have no fixed upper bound compiled into the operating system, but rather derive their limits based on basic operating system resources that I’ve already covered. Processes and threads, for example, require physical memory, virtual memory, and pool memory, so the number of processes or threads that can be created on a given Windows system is ultimately determined by one of these resources, depending on the way that the processes or threads are created and which constraint is hit first. I therefore recommend that you read the preceding posts if you haven’t, because I’ll be referring to reserved memory, committed memory, the system commit limit and other concepts I’ve covered. Here’s the index of the entire Pushing the Limits series. While they can stand on their own, they assume that you read them in order.


Pushing the Limits of Windows: Physical Memory

Pushing the Limits of Windows: Virtual Memory

Pushing the Limits of Windows: Paged and Nonpaged Pool

Pushing the Limits of Windows: Processes and Threads

Pushing the Limits of Windows: Handles

Pushing the Limits of Windows: USER and GDI Objects – Part 1

Pushing the Limits of Windows: USER and GDI Objects – Part 2


Processes and Threads

A Windows process is essentially a container that hosts the execution of an executable image file. It is represented with a kernel process object and Windows uses the process object and its associated data structures to store and track information about the image’s execution. For example, a process has a virtual address space that holds the process’s private and shared data and into which the executable image and its associated DLLs are mapped. Windows records the process’s use of resources for accounting and query by diagnostic tools and it registers the process’s references to operating system objects in the process’s handle table. Processes operate with a security context, called a token, that identifies the user account, account groups, and privileges assigned to the process.

Finally, a process includes one or more threads that actually execute the code in the process (technically, processes don’t run, threads do) and that are represented with kernel thread objects. There are several reasons applications create threads in addition to their default initial thread: processes with a user interface typically create threads to execute work so that the main thread remains responsive to user input and windowing commands; applications that want to take advantage of multiple processors for scalability or that want to continue executing while threads are tied up waiting for synchronous I/O operations to complete also benefit from multiple threads.

Thread Limits

Besides basic information about a thread, including its CPU register state, scheduling priority, and resource usage accounting, every thread has a portion of the process address space assigned to it, called a stack, which the thread can use as scratch storage as it executes program code to pass function parameters, maintain local variables, and save function return addresses. So that the system’s virtual memory isn’t unnecessarily wasted, only part of the stack is initially allocated, or committed, and the rest is simply reserved. Because stacks grow downward in memory, the system places guard pages beyond the committed part of the stack that trigger an automatic commitment of additional memory (called a stack expansion) when accessed. This figure shows how a stack’s committed region grows down and the guard page moves when the stack expands, with a 32-bit address space as an example (not drawn to scale):





The Portable Executable (PE) structures of the executable image specify the amount of address space reserved and initially committed for a thread’s stack. The linker defaults to a reserve of 1MB and commit of one page (4K), but developers can override these values either by changing the PE values when they link their program or for an individual thread in a call to CreateThread. You can use a tool like Dumpbin that comes with Visual Studio to look at the settings for an executable. Here’s the Dumpbin output with the /headers option for the executable generated by a new Visual Studio project:





Converting the numbers from hexadecimal, you can see the stack reserve size is 1MB and the initial commit is 4K. Using the new Sysinternals VMMap tool to attach to this process and view its address space, you can clearly see a thread stack’s initial committed page, a guard page, and the rest of the reserved stack memory:





Because each thread consumes part of a process’s address space, processes have a basic limit on the number of threads they can create that’s imposed by the size of their address space divided by the thread stack size.

32-bit Thread Limits

Even if the thread had no code or data and the entire address space could be used for stacks, a 32-bit process with the default 2GB address space could create at most 2,048 threads. Here’s the output of the Testlimit tool running on 32-bit Windows with the -t switch (create threads) confirming that limit:





Again, since part of the address space was already used by the code and initial heap, not all of the 2GB was available for thread stacks, thus the total threads created could not quite reach the theoretical limit of 2,048.

I linked the Testlimit executable with the large address space-aware option, meaning that if it’s presented with more than 2GB of address space (for example on 32-bit systems booted with the /3GB or /USERVA Boot.ini option or its equivalent BCD option on Vista and later, increaseuserva), it will use it. 32-bit processes are given 4GB of address space when they run on 64-bit Windows, so how many threads can the 32-bit Testlimit create when run on 64-bit Windows? Based on what we’ve covered so far, the answer should be roughly 4096 (4GB divided by 1MB), but the number is actually significantly smaller. Here’s 32-bit Testlimit running on 64-bit Windows XP:





The reason for the discrepancy comes from the fact that when you run a 32-bit application on 64-bit Windows, it is actually a 64-bit process that executes 64-bit code on behalf of the 32-bit threads, and therefore there is a 64-bit thread stack and a 32-bit thread stack area reserved for each thread. The 64-bit stack has a reserve of 256K (except that on systems prior to Vista, the initial thread’s 64-bit stack is 1MB). Because every 32-bit thread begins its life in 64-bit mode and the stack space it uses when starting exceeds a page, you’ll typically see at least 16KB of the 64-bit stack committed. Here’s an example of a 32-bit thread’s 64-bit and 32-bit stacks (the one labeled “Wow64” is the 32-bit stack):





32-bit Testlimit was able to create 3,204 threads on 64-bit Windows, which given that each thread uses 1MB+256K of address space for stack (again, except the first on versions of Windows prior to Vista, which uses 1MB+1MB), is exactly what you’d expect. I got different results when I ran 32-bit Testlimit on 64-bit Windows 7, however:





The difference between the Windows XP result and the Windows 7 result is caused by the more random nature of address space layout introduced in Windows Vista, Address Space Layout Randomization (ASLR), that leads to some fragmentation. Randomization of DLL loading, thread stack and heap placement, helps defend against malware code injection. As you can see from this VMMap output, there’s 357MB of address space still available, but the largest free block is only 128K in size, which is smaller than the 1MB required for a 32-bit stack:





As I mentioned, a developer can override the default stack reserve. One reason to do so is to avoid wasting address space when a thread’s stack usage will always be significantly less than the default 1MB. Testlimit sets the default stack reservation in its PE image to 64K and when you include the -n switch along with the -t switch, Testlimit creates threads with 64K stacks. Here’s the output on a 32-bit Windows XP system with 256MB RAM (I did this experiment on a small system to highlight this particular limit):





Note the different error, which implies that address space isn’t the issue here. In fact, 64K stacks should allow for around 32,000 threads (2GB/64K = 32,768). What’s the limit that’s being hit in this case? A look at the likely candidates, including commit and pool, doesn’t give any clues, as they’re all below their limits:





It’s only a look at additional memory information in the kernel debugger that reveals the threshold that’s being hit, resident available memory, which has been exhausted:





Resident available memory is the physical memory that can be assigned to data or code that must be kept in RAM. Nonpaged pool and nonpaged drivers count against it, for example, as does memory that’s locked in RAM for device I/O operations. Every thread has a user-mode stack, which is what I’ve been talking about, and also a kernel-mode stack that’s used when it runs in kernel mode, for example while executing system calls. When a thread is active its kernel stack is locked in memory so that the thread can execute code in the kernel that can’t page fault.

A basic kernel stack is 12K on 32-bit Windows and 24K on 64-bit Windows. 14,225 threads require about 170MB of resident available memory, which corresponds to exactly how much is free on this system when Testlimit isn’t running:





Once the resident available memory limit is hit, many basic operations begin failing. For example, here’s the error I got when I double-clicked on the desktop’s Internet Explorer shortcut:





As expected, when run on 64-bit Windows with 256MB of RAM, Testlimit is only able to create 6,600 threads, roughly half what it created on 32-bit Windows with 256MB RAM, before running out of resident available memory:





The reason I said “basic” kernel stack earlier is that a thread that executes graphics or windowing functions gets a “large” stack, 20K on 32-bit Windows and 48K on 64-bit Windows, when it executes the first such call. Testlimit’s threads don’t call any such APIs, so they have basic kernel stacks.

64-bit Thread Limits

Like 32-bit threads, 64-bit threads also have a default of 1MB reserved for stack, but 64-bit processes have a much larger user-mode address space (8TB), so address space shouldn’t be an issue when it comes to creating large numbers of threads. Resident available memory is obviously still a potential limiter, though. The 64-bit version of Testlimit (Testlimit64.exe) was able to create around 6,600 threads with and without the -n switch on the 256MB 64-bit Windows XP system, the same number that the 32-bit version created, because it also hit the resident available memory limit. However, on a system with 2GB of RAM, Testlimit64 was able to create only 55,000 threads, far below the number it should have been able to if resident available memory was the limiter (2GB/24K = 89,000):





In this case, it’s the initial thread stack commit that causes the system to run out of virtual memory and the “paging file is too small” error. Once the commit level reached the size of RAM, the rate of thread creation slowed to a crawl because the system started thrashing, paging out stacks of threads created earlier to make room for the stacks of new threads, and the paging file had to expand. The results are the same when the -n switch is specified, because the threads have the same initial stack commitment.

Process Limits

The number of processes that Windows supports obviously must be less than the number of threads, since each process has one thread and a process itself causes additional resource usage. 32-bit Testlimit running on a 2GB 64-bit Windows XP system created about 8,400 processes:





A look in the kernel debugger shows that it hit the resident available memory limit:





If the only cost of a process with respect to resident available memory was the kernel-mode thread stack, Testlimit would have been able to create far more than 8,400 processes on a 2GB system. The amount of resident available memory on this system when Testlimit isn’t running is 1.9GB:





Dividing the amount of resident memory Testlimit used (1.9GB) by the number of processes it created (8,400) yields 230K of resident memory per process. Since a 64-bit kernel stack is 24K, that leaves about 206K unaccounted for. Where’s the rest of the cost coming from? When a process is created, Windows reserves enough physical memory to accommodate the process’s minimum working set size. This acts as a guarantee to the process that no matter what, there will be enough physical memory available to hold enough data to satisfy its minimum working set. The default working set size happens to be 200KB, a fact that’s evident when you add the Minimum Working Set column to Process Explorer’s display:





The remaining roughly 6K is resident available memory charged for additional non-pageable memory allocated to represent a process. A process on 32-bit Windows will use slightly less resident memory because its kernel-mode thread stack is smaller.

As they can for user-mode thread stacks, processes can override their default working set size with the SetProcessWorkingSetSize function. Testlimit supports a -n switch, that when combined with -p, causes child processes of the main Testlimit process to set their working set to the minimum possible, which is 80K. Because the child processes must run to shrink their working sets, Testlimit sleeps after it can’t create any more processes and then tries again to give its children a chance to execute. Testlimit executed with the -n switch on a Windows 7 system with 4GB of RAM hit a limit other than resident available memory: the system commit limit:





Here you can see the kernel debugger reporting not only that the system commit limit had been hit, but that there have been thousands of memory allocation failures, both virtual and paged pool allocations, following the exhaustion of the commit limit (the system commit limit was actually hit several times as the paging file was filled and then grown to raise the limit):





The baseline commitment before Testlimit ran was about 1.5GB, so the processes had consumed about 8GB of committed memory. Each process therefore consumed roughly 8GB/6,600, or 1.2MB. The output of the kernel debugger’s !vm command, which shows the private memory allocated by each active process, confirms that calculation:





The initial thread stack commitment, described earlier, has a negligible impact with the rest coming from the memory required for the process address space data structures, page table entries, the handle table, process and thread objects, and private data the process creates when it initializes.

How Many Threads and Processes are Enough?

So the answer to the questions, “how many threads does Windows support?” and “how many processes can you run concurrently on Windows?” depends. In addition to the nuances of the way that the threads specify their stack sizes and processes specify their minimum working sets, the two major factors that determine the answer on any particular system include the amount of physical memory and the system commit limit. In any case, applications that create enough threads or processes to get anywhere near these limits should rethink their design, as there are almost always alternate ways to accomplish the same goals with a reasonable number. For instance, the general goal for a scalable application is to keep the number of threads running equal to the number of CPUs (with NUMA changing this to consider CPUs per node) and one way to achieve that is to switch from using synchronous I/O to using asynchronous I/O and rely on I/O completion ports to help match the number of running threads to the number of CPUs.

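Maximum thread count on Win7 and Server 2008 R2

During recent stress testing I found that on Win7 and Server 2008 R2 the thread pool could not be created when the thread count was set to 1,500. A closer look showed that 32-bit and 64-bit programs behave very differently.

Maximum number of threads:

32-bit: 1450

64-bit: 100000

The test code is as follows: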


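#include <stdio.h>
#include <windows.h>

/* Each thread blocks forever so that it keeps its stack reserved. */
DWORD CALLBACK ThreadProc(void *param)
{
    (void)param;
    Sleep(INFINITE);
    return 0;
}

int __cdecl main(int argc, const char *argv[])
{
    int i;
    for (i = 0; i < 100000; i++)
    {
        DWORD id;
        /* Request a 4096-byte stack reservation instead of the default 1 MB. */
        HANDLE h = CreateThread(NULL, 4096, ThreadProc, NULL, STACK_SIZE_PARAM_IS_A_RESERVATION, &id);
        if (!h)
            break;       /* creation failed: the limit has been reached */
        CloseHandle(h);  /* closing the handle does not stop the thread */
        printf("%d\n", i);
    }
    // default: 1413 [3/18/2012 Wang Jinhui]
    printf("Created %d threads\n", i);
    return 0;
}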