CloudStack Troubleshooting Roundup (updated from time to time)

2015-10-11 15:34
I use CloudStack a lot at work. Here are some troubleshooting notes collected along the way, shared in the hope that they help others.

I. Adding a Host Fails

Symptom 1:

[root@mgmt ~]# tail -f /var/log/cloudstack/management/management-server.log
2014-02-28 11:05:32,172 DEBUG [kvm.discoverer.LibvirtServerDiscoverer] (catalina-exec-22:null) Timeout, to wait for the host connecting to mgt svr, assuming it is failed
2014-02-28 11:05:32,205 WARN [cloud.resource.ResourceManagerImpl] (catalina-exec-22:null) Unable to find the server resources at http://192.168.150.250
2014-02-28 11:05:32,220 INFO [utils.exception.CSExceptionErrorCode] (catalina-exec-22:null) Could not find exception: com.cloud.exception.DiscoveryException in error code list for exceptions
2014-02-28 11:05:32,220 WARN [admin.host.AddHostCmd] (catalina-exec-22:null) Exception:
com.cloud.exception.DiscoveryException: Unable to add the host
    at com.cloud.resource.ResourceManagerImpl.discoverHostsFull(ResourceManagerImpl.java:798)
    at com.cloud.resource.ResourceManagerImpl.discoverHosts(ResourceManagerImpl.java:590)
    at org.apache.cloudstack.api.command.admin.host.AddHostCmd.execute(AddHostCmd.java:143)
    at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
    at com.cloud.api.ApiServer.queueCommand(ApiServer.java:514)
    at com.cloud.api.ApiServer.handleRequest(ApiServer.java:372)
    at com.cloud.api.ApiServlet.processRequest(ApiServlet.java:305)
    at com.cloud.api.ApiServlet.doPost(ApiServlet.java:71)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:555)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
    at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:721)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2268)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
2014-02-28 11:05:32,222 INFO [cloud.api.ApiServer] (catalina-exec-22:null) Unable to add the host
2014-02-28 11:05:32,224 DEBUG [cloud.api.ApiServlet] (catalina-exec-22:null) ===END=== 192.168.151.234 -- POST command=addHost&response=json&sessionkey=GEI3EIOONoV5RG9Mcs4xcdx31oc%3D


Symptom 2:

[root@kvm01 agent]# /etc/init.d/cloudstack-agent status    ## check the cloudstack-agent service status on the KVM host
cloudstack-agent dead but subsys locked


Symptom 3:

[root@kvm01 agent]# cat /var/log/cloudstack/agent/agent.log    ## look for exceptions in agent.log on the KVM host
ERROR [cloud.agent.AgentShell] (main:null) Unable to start agent: NO HVM support on this machine, please make sure: 1. VT/SVM is supported by your CPU, or is enabled in BIOS. 2. kvm modules are loaded (kvm, kvm_amd|kvm_intel)


Solution:

1. The virtualization package groups must be installed

[root@kvm01 agent]# yum -y groupinstall 'Virtualization' 'Virtualization Client' 'Virtualization Platform' 'Virtualization Tools'


2. Confirm that the kvm modules are loaded correctly

[root@kvm01 ~]# lsmod | grep kvm
kvm_intel              52570  0
kvm                   314739  1 kvm_intel


If nothing is printed, load the kvm module with one of the following commands:

[root@kvm01 ~]# modprobe kvm_intel    ## Intel platform
[root@kvm01 ~]# modprobe kvm_amd      ## AMD platform


3. Add the host again.
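
The two conditions named in the agent's error message (CPU virtualization support and loaded kvm modules) can also be checked in one go before retrying. The following is only a minimal sketch, assuming a Linux host that exposes CPU flags in /proc/cpuinfo and loaded modules in /proc/modules. The function names `cpu_has_virt` and `kvm_module_loaded` are hypothetical helpers, not part of CloudStack.

```shell
# Hypothetical helpers (not part of CloudStack). The file to inspect is an
# optional argument so the checks can also be run against a saved copy.

# Succeeds if the cpuinfo-style file lists the vmx (Intel VT-x) or svm (AMD-V) flag.
cpu_has_virt() {
    grep -Eq '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Succeeds if the modules file lists kvm_intel or kvm_amd.
kvm_module_loaded() {
    grep -Eq '^kvm_(intel|amd) ' "${1:-/proc/modules}"
}
```

For example, `cpu_has_virt || echo "no vmx/svm flag: check CPU or BIOS"` followed by `kvm_module_loaded || echo "kvm_intel/kvm_amd not loaded"` reports which of the two conditions is unmet.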

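Bonus tip:

Errors during host addition come in every variety, and Java's error output is, well... Here is a small trick.

When adding a host fails and the log gives no clear cause, you can run the add-host command by hand on the agent. The exact command can be found in the management server log: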

[root@localhost management]# cat /var/log/cloudstack/management/management-server.log | grep cloudstack-setup-agent
2014-03-13 09:56:17,758 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 09:56:52,775 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-11:null) cloudstack-setup-agent -m 192.168.153.28 -z 2 -p 2 -c 2 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!
2014-03-13 11:12:22,455 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) Executing cmd: cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
2014-03-13 11:12:57,267 DEBUG [utils.ssh.SSHCmdHelper] (catalina-exec-12:null) cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0 output:CloudStack Agent setup is done!


In my example above, that yields the following command, which is then run on the agent:

[root@kvm01 ~]# cloudstack-setup-agent -m 192.168.153.28 -z 3 -p 3 -c 3 -g 0d21492f-9565-329d-9a26-0c85f6d39d12 -a --pubNic=cloud0 --prvNic=cloud0 --guestNic=cloud0
Starting to configure your system:
Configure Cgroup ...          [OK]
Configure SElinux ...         [OK]
Configure Network ...         [OK]
Configure Libvirt ...         [OK]
Configure Firewall ...        [OK]
Configure Nfs ...             [OK]
Configure cloudAgent ...      [OK]
CloudStack Agent setup is done!
[root@kvm01 ~]#

If any step fails during this run, it is easy to tell which step the problem is in.
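
Picking the command out of the log can be scripted. The sketch below assumes that the management log marks the command with the text "Executing cmd:" as in the excerpt above; `last_setup_cmd` is a hypothetical helper name, and the default path is the log file used throughout this post.

```shell
# Hypothetical helper: print the most recent cloudstack-setup-agent command
# that the management server logged with an "Executing cmd:" prefix.
# The log file is an optional argument; the default is the path used in this post.
last_setup_cmd() {
    sed -n 's/^.*Executing cmd: *\(cloudstack-setup-agent .*\)$/\1/p' \
        "${1:-/var/log/cloudstack/management/management-server.log}" | tail -n 1
}
```

Run `last_setup_cmd` on the management server, review the printed command, then paste it into a shell on the agent host.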

The options of the cloudstack-setup-agent command used above are listed below; adapt them to your own environment:

[root@dbserver ~]# cloudstack-setup-agent -h
Usage: cloudstack-setup-agent [options]

Options:
  -h, --help            show this help message and exit
  -a                    auto mode
  -m MGT, --host=MGT    Management server hostname or IP-Address
  -z ZONE, --zone=ZONE  zone id
  -p POD, --pod=POD     pod id
  -c CLUSTER, --cluster=CLUSTER
                        cluster id
  -g GUID, --guid=GUID  guid
  --pubNic=PUBNIC       Public traffic interface
  --prvNic=PRVNIC       Private traffic interface
  --guestNic=GUESTNIC   Guest traffic interface


The actual values for these options can be read from /etc/cloudstack/agent/agent.properties on the agent host:

[root@kvm01 ~]# cat /etc/cloudstack/agent/agent.properties
#Storage
#Thu Mar 13 11:23:48 CST 2014
guest.network.device=cloud0
workers=5
private.network.device=cloud0
port=8250
resource=com.cloud.hypervisor.kvm.resource.LibvirtComputingResource
pod=3
zone=3
guid=0d21492f-9565-329d-9a26-0c85f6d39d12
public.network.device=cloud0
cluster=3
local.storage.uuid=ac70655b-f452-4d14-a1a1-2a5eebc4bb01
domr.scripts.dir=scripts/network/domr/kvm
LibvirtComputingResource.id=0
host=192.168.153.28
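
Putting the two listings together, the command line can be rebuilt from the properties file. This is a sketch only: the mapping of property keys to options (host to -m, zone to -z, pod to -p, cluster to -c, guid to -g, and the three *.network.device keys to the NIC options) is inferred from the example values in this post rather than from CloudStack documentation, and `prop` and `build_setup_cmd` are hypothetical helper names.

```shell
# Hypothetical helper: read one key from a properties file.
# Handles plain key=value lines only (no continuation lines or escapes).
prop() {
    key=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the sed pattern
    sed -n "s/^${key}=//p" "$2" | tail -n 1
}

# Hypothetical helper: print a cloudstack-setup-agent command built from the
# given agent.properties. The key-to-option mapping is an assumption inferred
# from the example in this post.
build_setup_cmd() {
    f="${1:-/etc/cloudstack/agent/agent.properties}"
    printf 'cloudstack-setup-agent -m %s -z %s -p %s -c %s -g %s -a --pubNic=%s --prvNic=%s --guestNic=%s\n' \
        "$(prop host "$f")" "$(prop zone "$f")" "$(prop pod "$f")" \
        "$(prop cluster "$f")" "$(prop guid "$f")" \
        "$(prop public.network.device "$f")" \
        "$(prop private.network.device "$f")" \
        "$(prop guest.network.device "$f")"
}
```

The function only prints the command, so it can be compared with the one found in the management log before anything is executed.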


More to come...

This article comes from the "systems" blog; please keep this attribution: /article/4532068.html