PHP string truncation (enhanced version, compatible with UTF-8 and GBK)

2012-10-31 14:51
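// Escape HTML special characters in a string (or recursively in an array),
// leaving existing entities such as &#1234; or &nbsp; intact.
function htmlencode($string) {
    if(is_array($string)) {
        foreach($string as $key => $val) {
            $string[$key] = htmlencode($val);
        }
    } else {
        // Escape everything first, then undo the double-escaping of entities.
        $string = preg_replace('/&amp;((#(\d{3,5}|x[a-fA-F0-9]{4})|[a-zA-Z][a-z0-9]{2,5});)/', '&\\1', str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $string));
    }
    return $string;
}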

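// Truncate $string to at most $length display columns (ASCII counts as 1,
// a multibyte character as 2) and append $dot when something was cut off.
function cutStr($string, $length = 0, $dot = '') {
    $string = strip_tags($string);
    $string = trim($string);
    // Decode the basic entities so that each one counts as a single character.
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);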
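    // phpcharset() is a helper that is not included in this post; it presumably
    // converts $string to the given charset, so $strlen is the length in GBK bytes.
    $strlen = strlen(phpcharset($string, 'GBK'));
    $charset = mb_detect_encoding($string, array('ASCII', 'UTF-8', 'GBK', 'GB2312', 'BIG5'));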
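    if($length && ($strlen > $length)) {
        $wordscut = '';
        if(strtolower($charset) == 'utf-8') {
            $n = 0;   // byte offset into $string
            $tn = 0;  // byte length of the last character read
            $noc = 0; // display width accumulated so far
            while($n < strlen($string)) {
                $t = ord($string[$n]);
                if($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                    // ASCII: 1 byte, width 1
                    $tn = 1;
                    $n++;
                    $noc++;
                } elseif(194 <= $t && $t <= 223) {
                    // 2-byte sequence, width 2
                    $tn = 2;
                    $n += 2;
                    $noc += 2;
                } elseif(224 <= $t && $t <= 239) {
                    // 3-byte sequence, width 2. The upper bound must include 239 (0xEF),
                    // the lead byte of full-width punctuation such as U+FF0C.
                    $tn = 3;
                    $n += 3;
                    $noc += 2;
                } elseif(240 <= $t && $t <= 247) {
                    // 4-byte sequence, width 2
                    $tn = 4;
                    $n += 4;
                    $noc += 2;
                } elseif(248 <= $t && $t <= 251) {
                    // 5-byte form (legacy; no longer valid UTF-8)
                    $tn = 5;
                    $n += 5;
                    $noc += 2;
                } elseif($t == 252 || $t == 253) {
                    // 6-byte form (legacy; no longer valid UTF-8)
                    $tn = 6;
                    $n += 6;
                    $noc += 2;
                } else {
                    // Control or stray byte: skip it without counting any width.
                    $n++;
                }
                if($noc >= $length) {
                    break;
                }
            }
            // If the last character overshot the limit, drop it again.
            if($noc > $length) {
                $n -= $tn;
            }
            $wordscut = substr($string, 0, $n);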
        } else {
            // GBK and other double-byte charsets: a byte above 127 starts a
            // 2-byte character, so copy both bytes together.
            for($i = 0; $i < $length - 1; $i++) {
                if(ord($string[$i]) > 127) {
                    $wordscut .= $string[$i].$string[$i + 1];
                    $i++;
                } else {
                    $wordscut .= $string[$i];
                }
            }
        }
        $wordscut .= $dot;
        $string = $wordscut;
    }
    return htmlencode($string);
}
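
For comparison, the width-based truncation that cutStr performs for UTF-8 input can also be sketched with the mbstring built-ins mb_strwidth() and mb_strimwidth(). This is only a minimal sketch, not the code from this post: the name cutStrMb is hypothetical, the input is assumed to already be UTF-8, and the tag stripping and HTML escaping done by cutStr are left out on purpose.

```php
<?php
// Hypothetical helper (not part of the original post): truncate a UTF-8
// string to a display width, where CJK characters count as width 2.
// Assumes the mbstring extension is loaded and $text is valid UTF-8.
function cutStrMb(string $text, int $width, string $dot = ''): string
{
    // No limit given, or the text already fits: return it unchanged.
    if ($width <= 0 || mb_strwidth($text, 'UTF-8') <= $width) {
        return $text;
    }
    // mb_strimwidth() cuts on character boundaries, so a multibyte
    // character is never split in the middle.
    return mb_strimwidth($text, 0, $width, '', 'UTF-8') . $dot;
}

echo cutStrMb('中文字符串', 4, '...'), "\n"; // 中文...
echo cutStrMb('hello world', 5), "\n";       // hello
```

As in cutStr, the marker passed as $dot is appended after the cut and does not count toward the width limit.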