A problem encountered when converting an integer to a string
2010-12-08 19:35
Building Heritrix in Eclipse
Category: Java programming
This guide uses Heritrix 1.14.4 (the release dated May 10, 2010, which is the latest version at the time of writing).
1. Download heritrix-1.14.4.zip and heritrix-1.14.4-src.zip from http://sourceforge.net/projects/archive-crawler/
2. Create a Java project in Eclipse, then extract both heritrix-1.14.4.zip and heritrix-1.14.4-src.zip.
3. Copy the folders "com", "org", and "st" from heritrix-1.14.4-src.zip into the src folder of the project, for example "D:\workspace_eclipse\heritrix2\src".
4. Copy the contents of the folder "src/conf/" from heritrix-1.14.4-src.zip into the src folder of the project, for example "D:\workspace_eclipse\heritrix2\src".
5. Copy all .jar files from the lib folder of the extracted heritrix-1.14.4-src.zip into the lib folder of the project.
6. Copy the file tlds-alpha-by-domain.txt from "src/resources/org/archive/util" in the extracted heritrix-1.14.4-src.zip into the corresponding package under src, for example "D:\workspace_eclipse\heritrix2\src\org\archive\util".
7. Copy the "webapps" folder from heritrix-1.14.4.zip into the project root directory, for example "D:\workspace_eclipse\heritrix2\webapps".
If the folder is not named webapps, Heritrix.java must be changed accordingly.
8. Edit the configuration: find heritrix.properties among the conf files and set the following.
# Set the admin user name and password
heritrix.cmdline.admin = admin:admin
# Set the port
heritrix.cmdline.port = 8080
9. Add all the .jar files in the project's lib folder to the project's build path.
10. Right-click org.archive.crawler.Heritrix.java in the project, open its run configuration, and select the Classpath tab.
Select User Entries, then Advanced.
Select Add Folders and add the conf folder.
Click Run to start Heritrix. The console should show output like the following:
05:22:32.875 EVENT Starting Jetty/4.2.23
05:22:32.937 WARN!! Delete existing temp dir C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\Jetty_127_0_0_1_8080__ for WebApplicationContext[/,jar:file:/D:/workspace/jcjcd/heritrixDemo/webapps/admin.war!/]
05:22:33.062 EVENT Started WebApplicationContext[/,Heritrix Console]
05:22:33.156 EVENT Started SocketListener on 127.0.0.1:8080
05:22:33.156 EVENT Started org.mortbay.jetty.Server@1f6f0bf
Heritrix version: @VERSION@
This completes the Heritrix setup in Eclipse.
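Before launching, it can help to confirm that the two values set in step 8 parse as intended. The following is a minimal sketch that reads them with the standard java.util.Properties class. The class name HeritrixPropsCheck and its helper methods are hypothetical, written for this illustration only; this is not how Heritrix itself loads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for this guide; not part of Heritrix.
final class HeritrixPropsCheck {

    /** Parses properties-file text into a Properties object. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Splits the "user:password" value of heritrix.cmdline.admin. */
    static String[] adminCredentials(Properties props) {
        String value = props.getProperty("heritrix.cmdline.admin", "").trim();
        int colon = value.indexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            throw new IllegalArgumentException("expected user:password, got: " + value);
        }
        return new String[] { value.substring(0, colon), value.substring(colon + 1) };
    }

    /** Reads heritrix.cmdline.port and checks that it is a valid TCP port. */
    static int port(Properties props) {
        int port = Integer.parseInt(props.getProperty("heritrix.cmdline.port", "").trim());
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        // Same two settings as in step 8, inlined so the sketch runs on its own.
        String sample = "heritrix.cmdline.admin = admin:admin\n"
                      + "heritrix.cmdline.port = 8080\n";
        Properties props = load(sample);
        String[] creds = adminCredentials(props);
        System.out.println("user=" + creds[0] + ", port=" + port(props));
    }
}
```

To check a real file instead of the inlined sample, the text of heritrix.properties could be read from disk and passed to load.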
Now we can create a job for testing.
1. Open http://127.0.0.1:8080 in your browser and log in with the user name and password set in the configuration file.
2. Create a job: select Jobs in the navigation menu, then select With defaults under Create New Job.
3. Fill in the name, the description, and the URL to crawl.
4. Select Modules. Here we save the crawl results as a mirror of the site, whereas the default output is compressed. Under Select Writers, remove org.archive.crawler.writer.ARCWriterProcessor and add org.archive.crawler.writer.MirrorWriterProcessor.
5. Select Settings at the bottom of the page. Many items can be set here, such as the maximum number of threads and the timeouts.
Two items under http-headers (HTTP headers) must be set:
user-agent: Mozilla/5.0 (compatible; heritrix/@VERSION@ +PROJECT_URL_HERE)
from: CONTACT_EMAIL_ADDRESS_HERE
Here I simply replaced @VERSION@ with the Heritrix version, changed PROJECT_URL_HERE to http:// followed by the local IP address, and set CONTACT_EMAIL_ADDRESS_HERE to an arbitrary email address. When the configuration is complete, select Submit job.
6. Go to the Console and click Start to begin the crawl job.
When the crawl completes, the results can be found in the jobs folder under the project.
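The placeholder substitution in step 5 is plain string replacement. The sketch below shows it in Java, assuming the user-agent template is written without stray spaces around the placeholders. The class UserAgentTemplate and its methods are hypothetical names for this illustration, and the version and URL passed in main are only the example values described above.

```java
// Hypothetical helper for this guide; not part of Heritrix.
final class UserAgentTemplate {

    // Template as it appears in the job settings, placeholders unfilled.
    static final String TEMPLATE =
            "Mozilla/5.0 (compatible; heritrix/@VERSION@ +PROJECT_URL_HERE)";

    /** Replaces both placeholders in the user-agent template. */
    static String fill(String version, String projectUrl) {
        if (!projectUrl.startsWith("http://") && !projectUrl.startsWith("https://")) {
            throw new IllegalArgumentException("project URL must start with http:// or https://");
        }
        return TEMPLATE.replace("@VERSION@", version)
                       .replace("PROJECT_URL_HERE", projectUrl);
    }

    /** Returns true if a header value still contains an unfilled placeholder. */
    static boolean hasPlaceholder(String value) {
        return value.contains("@VERSION@")
            || value.contains("PROJECT_URL_HERE")
            || value.contains("CONTACT_EMAIL_ADDRESS_HERE");
    }

    public static void main(String[] args) {
        String userAgent = fill("1.14.4", "http://127.0.0.1");
        System.out.println(userAgent);
        // prints: Mozilla/5.0 (compatible; heritrix/1.14.4 +http://127.0.0.1)
        System.out.println(hasPlaceholder(userAgent)); // prints: false
    }
}
```

A check like hasPlaceholder is a quick way to spot a header that was left at its default before the job is submitted.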