Wiki Link: [discussion:41067]
PowerShell Translator via Web-based Services

Tags: Scripts

Nov 29 2008 at 5:24 AM
Edited Nov 29 2008 at 5:29 AM
The script below is a simple implementation of an in-shell translator function. It uses a System.Net.WebClient object to query web-based dictionary services (such as Google Dictionary). The default translation direction is English to Chinese Simplified.
It has been tested successfully under PowerShell v1.
Information is extracted based on an inspection of the tags in the returned page. For Google Dictionary, the tags <span class="mn"> ... </span> enclose the meaning of the word, and a regular expression matches them.
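To illustrate the extraction step described above, here is a minimal sketch that runs the same kind of regular expression against a small, made-up HTML fragment. The markup in $html is an assumption for illustration only; the real page layout may differ.

```powershell
# Hypothetical sample markup; not copied from the real service.
$html = @'
<div><span class="mn">hello</span>
<span class="mn">
  greeting
</span></div>
'@

# Singleline lets "." match newlines, so spans that wrap across lines are captured too.
$pattern = '<span class="mn">(.*?)</span>'
$meanings = @()
foreach ($m in [regex]::Matches($html, $pattern, 'IgnoreCase, Singleline')) {
    # Group 1 is the text between the opening and the closing tag.
    $meanings += $m.Groups[1].Value.Trim()
}
$meanings
```

Reading capture group 1 avoids a second pass that removes the opening and closing tags from the matched text.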
Type
        Translate ?
for the whole list of languages supported.
However, this implementation still has some pitfalls.
First, the information returned is incomplete. Although <span class="mn">...</span> contains most of the useful results, the script does not extract the usage samples and their translations (which have the format <samp>...<span class="translation">...</span>...</samp>). I have not implemented this part yet because it is difficult to write a regex that extracts the text between a pair of tags when other tags are nested inside it.
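One possible way around the nested-tag difficulty is to work in two passes: match the outer <samp> element first, then pull the translation span out of the captured text and strip whatever tags remain. The sketch below assumes that <samp> elements never nest inside each other and that the translation span contains no further <span>; the sample markup is made up for illustration.

```powershell
# Hypothetical sample markup for one usage sample and its translation (en -> fr).
$html = '<samp>good <b>morning</b><span class="translation">bonjour</span></samp>'

$opts = 'IgnoreCase, Singleline'
$samples = @()
foreach ($m in [regex]::Matches($html, '<samp>(.*?)</samp>', $opts)) {
    $inner = $m.Groups[1].Value
    $translation = ''
    $text = $inner
    # Pass 2a: pull the translation out of its own span, if there is one.
    $tr = [regex]::Match($inner, '<span class="translation">(.*?)</span>', $opts)
    if ($tr.Success) {
        $translation = $tr.Groups[1].Value.Trim()
        $text = $inner.Replace($tr.Value, '')
    }
    # Pass 2b: strip any tags that remain in the sample text.
    $text = [regex]::Replace($text, '<[^>]+>', '').Trim()
    $samples += "$text = $translation"
}
$samples
```

The same tag-stripping replace ('<[^>]+>') might also help with the leftover tag fragments in the meanings, although it only removes complete tags.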
Second, the returned information sometimes still contains fragments of tags (try this: translate "UK"). I think this is also due to the regex and my incomplete research into the tags.
These two problems could be alleviated by using XmlDocument and GetElementsByTagName, I think, provided that the returned page is well-formed (it is not).
Third, when the request fails (for example, after typing an unsupported language code), the exception error message is still displayed. I think this could be handled by setting $ErrorActionPreference to "SilentlyContinue", but I did not do that because I did not want to affect the global variable.
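A trap statement inside the function is one way to swallow the exception without touching any global setting. The sketch below is an assumption about how this could be wired in, not part of the script; Get-PageOrNull is a hypothetical helper name.

```powershell
# Hypothetical helper: returns the page text, or $null when the download fails.
function Get-PageOrNull([string]$url) {
    # trap catches terminating errors raised in this function;
    # "continue" suppresses the message and resumes after the failing statement.
    trap { continue }
    $wc = New-Object System.Net.WebClient
    $page = $wc.DownloadString($url)
    # If DownloadString threw, $page was never assigned, so this returns $null.
    return $page
}

# An unsupported URI scheme fails without needing a network connection.
$result = Get-PageOrNull "bogus://nowhere/"
if ($result -eq $null) { "request failed" }
```

A variable assigned inside a function is local to that function by default, so assigning $ErrorActionPreference within Translate may be another option that leaves the global value alone.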

One question remains: how is the $args variable defined and used within a function? I am trying to get the argument count, but it fails.
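On the $args question, a small sketch may help: declared parameters are bound first, and $args receives only the arguments that are left over. The function names below are hypothetical.

```powershell
# No declared parameters: every argument lands in $args.
function Count-All { return $args.Count }

# Declared parameters bind first; $args holds only the leftover arguments.
function Count-Rest([string]$first, [string]$second) { return $args.Count }

Count-All "a" "b" "c"     # 3
Count-Rest "a" "b" "c"    # 1
Count-Rest "a"            # 0
```

Since Translate declares three parameters, $args.Count inside it stays at 0 unless a fourth argument is passed.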
Another question: some Asian characters, such as Korean, are not displayed correctly, while both Chinese Simplified and Traditional are fine. I am not sure whether this is due to my settings (language: English (United States), locale: PRC).
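To check whether the Korean text is damaged in the data or only in the console display, one option is to write the result to a UTF-8 file and open it in an editor. This is a diagnostic sketch only; the file name is an arbitrary assumption and the Korean string stands in for an actual result of Translate.

```powershell
# Build a Korean test string from code points, so the script file encoding does not matter.
$korean = "" + [char]0xD55C + [char]0xAD6D + [char]0xC5B4

# Hypothetical output path in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) "translate-check.txt"
$korean | Out-File -FilePath $path -Encoding UTF8

# Read the file back as UTF-8 and compare it with the original string.
$roundTrip = [System.IO.File]::ReadAllText($path, [System.Text.Encoding]::UTF8).Trim()
$roundTrip -eq $korean
```

If the file shows the characters correctly, the strings themselves are intact and the problem probably lies with the console font or code page rather than with the script.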


function Translate([string]$_literal = $null, [string]$_langFrom = "en", [string]$_langTo="zh-CN"){
# A [string] parameter turns $null into "", so test for an empty string instead of $null.
if ([string]::IsNullOrEmpty($_literal)){ return $null;}
if ($_literal -eq "?"){
    # The closing "@ of a here-string must start at the beginning of its line.
    $_hlp = @"
    Translate [literal] [langFrom] [langTo]`n
    Language options: 
    English: en
    Chinese Simplified: zh-CN
    Chinese Traditional: zh-TW
    French: fr
    German: de
    Italian: it
    Korean: ko
    Portuguese: pt
    Russian: ru
    Spanish: es
"@;
    write-host $_hlp;
    return $null;
}
# Load System.Web for HttpUtility.UrlEncode; the returned assembly object is discarded.
$_tmp = [system.reflection.assembly]::loadwithpartialname("system.web");
# Build the query URL; %7C is the URL-encoded "|" between the two language codes.
$_url = "http://www.google.com/dictionary?langpair="+$_langFrom+"%7C"+$_langTo+"&q="+[system.web.httputility]::urlencode($_literal)+"&hl=en&aq=f";
$_wc = new-object -typename system.net.webclient;
$_wc.encoding = [system.text.encoding]::utf8;
$_rslt = $_wc.downloadstring($_url);
# Capture the text between <span class="mn"> and the next </span>; Singleline lets "." match newlines.
$_matchMN = new-object -typename System.Text.RegularExpressions.Regex -argument "<span class=`"mn`">(.*?)</span>","IgnoreCase, Singleline";
$_al = new-object -typename system.collections.arraylist;
foreach ($_item in $_matchMN.matches($_rslt)){
    # Group 1 holds the text inside the span; the index returned by Add is assigned so it is not output.
    $_tm = $_al.add($_item.groups[1].value.trim());
}
return $_al.toArray();
}
