convertor-indic : 

The convertor converts an input file between UTF and WX encodings. In SSF
input, the fields that are converted are TKN, lex, vib, name and head.
If the input file is in WX format, the program converts only the words that
are in WX format, whether or not they start with the "@" symbol. If the
input file is in UTF format, the program converts only the words that are in
the source language specified by the user, again whether or not they start
with the "@" symbol. The convertor does not touch Unicode characters of any
third language (i.e. any language other than the source language) while
processing. The input file must be in SSF or TEXT format.

How to use?

1) convertor-indic (wx2utf or utf2wx)

perl convertor_indic.pl --format=[ssf|text] --lang=[hin|tel|..] --src_encoding=[utf|wx] \
--tgt_encoding=[wx|utf] --input=<input-file>

e.g.

a) UTF to WX for Hindi (SSF format)

perl convertor_indic.pl -f=ssf -l=hin -s=utf -t=wx -i=tests/hin/ssf/test_case_3_utf.in

output will be printed to STDOUT

b) WX to UTF for Hindi (SSF format)

perl convertor_indic.pl -f=ssf -l=hin -s=wx -t=utf -i=tests/hin/ssf/test_case_3_wx.in

output will be printed to STDOUT

c) UTF to WX for Hindi (TEXT format)

perl convertor_indic.pl -f=text -l=hin -s=utf -t=wx -i=tests/hin/text/sample_story_utf.in

output will be printed to STDOUT

d) WX to UTF for Hindi (TEXT format)

perl convertor_indic.pl -f=text -l=hin -s=wx -t=utf -i=tests/hin/text/sample_story_wx.in

output will be printed to STDOUT

e) For more information, check the built-in help

perl convertor_indic.pl --help


Directory Structure:

convertor-indic
     |
     |---lib (source code of the convertor library)
     |
     |---ssfapi (SSF APIs)
     |
     |---tests (contains the reference input and output)
     |
     |---doc (documentation)
     |
     |---extra-files (some backup/extra files)
     |
     |---convertor_indic.pl (main file)
     |
     |---wx2utf.pl (WX to UTF conversion script)
     |
     |---utf2wx.pl (UTF to WX conversion script)
     |
     |---README (How to run/use the module)
     |
     |---ChangeLog (version information)



Contact :
Rashid Ahmad
Expert Software Ltd.
rashid101b@gmail.com