
Shout at your computer to open the Google site or a command-line terminal: controlling the computer with Google speech recognition

2013-04-08 23:16

1) Edit the package sources file: @ubuntu:~$ sudo vim /etc/apt/sources.list

    deb http://cn.archive.ubuntu.com/ubuntu precise main restricted universe

2) After editing the sources file, refresh the package index: sudo aptitude update

3) My own database installation was broken, so I reinstalled MySQL:

    @ubuntu:~$ aptitude search mysql-server

    @ubuntu:~$ sudo aptitude reinstall mysql-server

    @ubuntu:~$ sudo aptitude install mysql-server

    @ubuntu:~$ sudo aptitude purge mysql-server

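4) Install the required packages and dependencies:

sudo easy_install wave    # wave ships with Python's standard library, so this step is usually unnecessary

~/pyvoice$ sudo aptitude install flac    # this tool converts WAV to FLAC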

~/pyvoice$ sudo aptitude install python-alsaaudio

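5) Approach:

Input -> processing -> output

1: Capture audio from the computer's microphone --> WAV file

   python record wav

2: Recorded file --> text

   STT: Speech to Text

   STT API: Google API

   TTS: Text to Speech

3: Text --> computer command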
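
The three stages above can be sketched as a small chain of functions. This is only an illustration: `run_pipeline` and the `fake_*` stand-ins are hypothetical names that do not appear in the original scripts, and the stand-ins return canned values instead of touching the microphone or the network.

```python
def run_pipeline(record, recognize, dispatch):
    """Chain the three stages: audio capture -> speech-to-text -> command."""
    wav_path = record()           # stage 1: microphone -> WAV file path
    text = recognize(wav_path)    # stage 2: WAV file -> recognized text
    return dispatch(text)         # stage 3: text -> command to run


# Hypothetical stand-in stages so the flow can be followed without hardware.
def fake_record():
    return "1.wav"


def fake_recognize(wav_path):
    return "新浪"


def fake_dispatch(text):
    return "firefox www.sina.com.cn" if "新浪" in text else None


command = run_pipeline(fake_record, fake_recognize, fake_dispatch)
print(command)
```

In the actual setup each stage is a separate script (recordtest.py, stt_google.py, runcmd.py), and the stages hand data to each other through the files 1.wav and 1.txt.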

6) Code:

jiangge@ubuntu:~/pycode/pyvoice$ tree
.
├── 1.flac
├── 1.txt
├── 1.wav
├── recordtest.py
├── runcmd.py
└── stt_google.py

Audio sampling script: recordtest.py

SpeechToText:stt_google.py

runcmd.py

1.txt

Audio sampling script: recordtest.py

#!/usr/bin/env python

## recordtest.py
##
## This is an example of a simple sound capture script.
##
## The script opens an ALSA pcm for sound capture. Set
## various attributes of the capture, and reads in a loop,
## writing the data to a WAV file.
##
## To test it out do the following:
## python recordtest.py out.wav # talk to the microphone
## aplay out.wav

# Footnote: I'd normally use print instead of sys.std(out|err).write,
# but we're in the middle of the conversion between python 2 and 3
# and this code runs on both versions without conversion

import sys
import time
import getopt
import alsaaudio
import wave

def usage():
    sys.stderr.write('usage: recordtest.py [-c <card>] <file>\n')
    sys.exit(2)

if __name__ == '__main__':

    card = 'default'

    opts, args = getopt.getopt(sys.argv[1:], 'c:')
    for o, a in opts:
        if o == '-c':
            card = a

    if not args:
        usage()

    # Output WAV file: mono, 16-bit samples, 8000 Hz
    f = wave.open(args[0], 'wb')
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(8000)

    # Open the device in nonblocking capture mode. The last argument could
    # just as well have been zero for blocking mode. Then we could have
    # left out the sleep call in the bottom of the loop
    inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, card)

    # Set attributes: Mono, 8000 Hz, 16 bit little endian samples
    inp.setchannels(1)
    inp.setrate(8000)
    inp.setformat(alsaaudio.PCM_FORMAT_S16_LE)

    # The period size controls the internal number of frames per period.
    # The significance of this parameter is documented in the ALSA api.
    # For our purposes, it is sufficient to know that reads from the device
    # will return this many frames. Each frame being 2 bytes long.
    # This means that the reads below will return either 320 bytes of data
    # or 0 bytes of data. The latter is possible because we are in nonblocking
    # mode.
    inp.setperiodsize(160)

    loops = 2000000
    while loops > 0:
        loops -= 1
        # Read data from device
        l, data = inp.read()

        if l:
            f.writeframes(data)
            time.sleep(.001)

    # Close the file so that the WAV header is finalized
    f.close()
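
The capture settings in recordtest.py (mono, 16-bit samples, 8000 Hz, 160 frames per period) fix how many bytes each read returns and how much audio that covers. The sketch below works through that arithmetic using only the standard `wave` module. It writes silence to an in-memory buffer as a stand-in for microphone data, so it runs without ALSA; the constant names are chosen for illustration.

```python
import io
import wave

CHANNELS = 1         # mono, as in recordtest.py
SAMPLE_WIDTH = 2     # bytes per sample (16-bit)
RATE = 8000          # frames per second
PERIOD_FRAMES = 160  # frames per read, as set by setperiodsize(160)

period_bytes = PERIOD_FRAMES * CHANNELS * SAMPLE_WIDTH  # bytes per read
period_seconds = PERIOD_FRAMES / RATE                   # audio time per read

# Write 50 periods of silence to an in-memory WAV file.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(CHANNELS)
    w.setsampwidth(SAMPLE_WIDTH)
    w.setframerate(RATE)
    for _ in range(50):
        w.writeframes(b"\x00" * period_bytes)

# Read it back and compute the duration from the header.
buf.seek(0)
with wave.open(buf, "rb") as r:
    frames = r.getnframes()
    duration = frames / r.getframerate()

print(period_bytes, period_seconds, frames, duration)
```

Each read therefore carries 320 bytes, or 20 ms of sound, and 50 such reads add up to one second of audio.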

SpeechToText: stt_google.py

#coding=utf-8
import os
import urllib2
import urllib
import time
import json

def writetofile(list_data):
    # Print every hypothesis and save them all to 1.txt for runcmd.py
    f = open("1.txt","w")
    for n in list_data:
        print n['utterance']
        f.write(n['utterance'].encode("utf-8"))
    f.close()

def stt_google_wav(filename):
    #Convert to flac
    os.system(FLAC_CONV+ filename+'.wav')
    f = open(filename+'.flac','rb')
    flac_cont = f.read()
    f.close()

    #post it
    lang_code='zh-CN'
    googl_speech_url = 'https://www.google.com.ua/speech-api/v1/recognize?xjerr=1&client=chromium&pfilter=2&lang=%s&maxresults=6'%(lang_code)
    hrs = {"User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7",'Content-type': 'audio/x-flac; rate=8000'}
    req = urllib2.Request(googl_speech_url, data=flac_cont, headers=hrs)
    p = urllib2.urlopen(req)
    data =  p.read()
    list_data = json.loads(data)["hypotheses"]
    writetofile(list_data)
    return

FLAC_CONV = 'flac -f ' # We need a WAV to FLAC converter.
if __name__ == '__main__':
    stt_google_wav("1")
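
stt_google.py expects the response body to be JSON with a top-level "hypotheses" list whose entries each carry an "utterance" string. A minimal sketch of that parsing step, without any network call, might look like the following. The helper name `extract_utterances` is hypothetical, and the sample body is made up for illustration; it is not a captured response.

```python
import json


def extract_utterances(raw_response):
    """Return the recognized strings from a response body, or [] if there are none."""
    payload = json.loads(raw_response)
    return [h["utterance"] for h in payload.get("hypotheses", []) if "utterance" in h]


# Made-up sample body, shaped like what stt_google.py reads.
sample = '{"hypotheses": [{"utterance": "新浪网"}, {"utterance": "新浪吗"}, {"utterance": "新浪"}]}'
utterances = extract_utterances(sample)
print("\n".join(utterances))
```

Unlike the direct `json.loads(data)["hypotheses"]` lookup in the script, this version returns an empty list when nothing was recognized instead of raising `KeyError`.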

runcmd.py

#coding=utf-8
import os

f = open("1.txt","r")
cmds = f.read()
f.close()

def run_browser(cmds):
    if cmds.find("谷歌") > -1:
        os.system("firefox www.google.com")
    if cmds.find("百度") > -1:
        os.system("firefox www.baidu.com")
    if cmds.find("新浪") > -1:
        os.system("firefox www.sina.com.cn")

def run_term(cmds):
    if cmds.find("终端") > -1:
        os.system("gnome-terminal")

def run_gedit(cmds):
    if cmds.find("编程") > -1:
        os.system("gedit test.py")

def run_jeap(cmds):
    # Count how many of these characters were recognized;
    # open the site when at least two of them appear
    count = 0
    if cmds.find("智") > -1:
        count += 1
    if cmds.find("志") > -1:
        count += 1
    if cmds.find("只") > -1:
        count += 1
    if cmds.find("支") > -1:
        count += 1
    if cmds.find("普") > -1:
        count += 1
    if cmds.find("扑") > -1:
        count += 1
    if cmds.find("谱") > -1:
        count += 1
    if count > 1:
        os.system("firefox www.jeapedu.com")

run_browser(cmds)
run_jeap(cmds)
run_term(cmds)
run_gedit(cmds)
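
The keyword checks in runcmd.py could also be written as a lookup table. The sketch below is one possible arrangement, not the original author's code: `COMMANDS`, `JEAP_CHARS` and `commands_for` are hypothetical names, and the function returns the commands instead of running them, so it can be tried safely.

```python
# Keyword -> shell command, mirroring the checks in runcmd.py.
COMMANDS = [
    ("谷歌", "firefox www.google.com"),
    ("百度", "firefox www.baidu.com"),
    ("新浪", "firefox www.sina.com.cn"),
    ("终端", "gnome-terminal"),
    ("编程", "gedit test.py"),
]

# Characters counted by run_jeap; two or more distinct hits count as a match.
JEAP_CHARS = "智志只支普扑谱"


def commands_for(text):
    """Return the shell commands that the recognized text should trigger."""
    matched = [cmd for keyword, cmd in COMMANDS if keyword in text]
    if sum(ch in text for ch in JEAP_CHARS) > 1:
        matched.append("firefox www.jeapedu.com")
    return matched


print(commands_for("新浪网新浪吗新浪"))
```

Adding a new voice command then means adding one row to the table. To actually launch the programs, each returned string could be passed to `os.system` as the original script does.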

7) Run:

jiangge@ubuntu:~/pycode/pyvoice$ python recordtest.py 1.wav;python stt_google.py ;python runcmd.py

Then shout "新浪" (Sina) at the computer.

The command line will show output like this:

flac 1.2.1, Copyright (C) 2000,2001,2002,2003,2004,2005,2006,2007 Josh Coalson

flac comes with ABSOLUTELY NO WARRANTY. This is free software, and you are

welcome to redistribute it under certain conditions. Type `flac' for details.

1.wav: wrote 23004 bytes, ratio=0.621

新浪网

新浪吗

新浪

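If everything works, the Sina home page will already be open in your browser. If it doesn't... well, the GFW is probably blocking the request to Google.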
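
The command line in step 7 joins the three scripts with `;`, so the later scripts run even if an earlier one fails (for example when the speech request cannot reach Google). As a sketch under that observation, a small runner could stop at the first failure. `run_steps` is a hypothetical helper, and the steps used here are harmless stand-ins rather than the real scripts.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order; stop and return False at the first failure."""
    for argv in steps:
        if subprocess.call(argv) != 0:
            return False
    return True


# Harmless stand-in steps. In the real setup these would be
# ["python", "recordtest.py", "1.wav"], ["python", "stt_google.py"]
# and ["python", "runcmd.py"].
ok = run_steps([[sys.executable, "-c", "pass"],
                [sys.executable, "-c", "pass"]])
failed = run_steps([[sys.executable, "-c", "raise SystemExit(1)"],
                    [sys.executable, "-c", "pass"]])
print(ok, failed)
```

In the shell, joining the three commands with `&&` instead of `;` has the same effect.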

----------------------------------------------------------------

References:
http://uliweb.clkg.org/forum/3/210
Code author: @hejiasheng
