"# ** Hybrid Neural Machine Translation for HimangiY **\n",
"#### Vandan Mujadia, Dipti Misra Sharma\n",
"#### LTRC, IIIT-Hyderabad, Hyderabad"
],
"metadata": {
"id": "axRPPFFocszE"
}
},
{
"cell_type": "markdown",
"source": [
"This demonstrates how to train a sequence-to-sequence (seq2seq) model for Kannada-to-Hindi translation **roughly** based on [Effective Approaches to Attention-based Neural Machine Translation](https://arxiv.org/abs/1706.03762) (Vaswani, Ashish et al).\n",
"\n",
"## An Example to Understand sequence to Sequence processing using Transformar Network.\n",
"\n",
"<img src=\"https://www.tensorflow.org/images/tutorials/transformer/apply_the_transformer_to_machine_translation.gif\" alt=\"Applying the Transformer to machine translation\">\n",
"\n",
"Source: [Google AI Blog](https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html)\n",
"\n"
],
"metadata": {
"id": "aX4IVQ5qf52I"
}
},
{
"cell_type": "markdown",
"source": [
"## Applying the Transformer to machine translation.\n",
" Found existing installation: torch 2.0.1+cu118\n",
" Uninstalling torch-2.0.1+cu118:\n",
" Successfully uninstalled torch-2.0.1+cu118\n",
" Attempting uninstall: torchtext\n",
" Found existing installation: torchtext 0.15.2\n",
" Uninstalling torchtext-0.15.2:\n",
" Successfully uninstalled torchtext-0.15.2\n",
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
"torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 1.11.0 which is incompatible.\n",
"torchdata 0.6.1 requires torch==2.0.1, but you have torch 1.11.0 which is incompatible.\n",
"torchvision 0.15.2+cu118 requires torch==2.0.1, but you have torch 1.11.0 which is incompatible.\u001b[0m\u001b[31m\n",
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
"Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from IL-Tokenizer==0.0.2) (6.0.1)\n",
"Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from IL-Tokenizer==0.0.2) (2.31.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->IL-Tokenizer==0.0.2) (3.2.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->IL-Tokenizer==0.0.2) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->IL-Tokenizer==0.0.2) (2.0.4)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->IL-Tokenizer==0.0.2) (2023.7.22)\n",
"Building wheels for collected packages: IL-Tokenizer\n",
" Building wheel for IL-Tokenizer (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
" Created wheel for IL-Tokenizer: filename=IL_Tokenizer-0.0.2-py3-none-any.whl size=7225 sha256=1bec4df8b3d0a8ca3a48367f72deb4b2d68623a782699f27934083cfbaa6b959\n",
" Stored in directory: /tmp/pip-ephem-wheel-cache-624d680m/wheels/9a/fb/5b/3d75bfde8561726121c09f0f0a83389c05312df8a513808c41\n",
"Successfully built IL-Tokenizer\n",
"Installing collected packages: IL-Tokenizer\n",
"Successfully installed IL-Tokenizer-0.0.2\n",
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",
"Requirement already satisfied: torch==1.13.1 in /usr/local/lib/python3.10/dist-packages (1.13.1)\n",
"Requirement already satisfied: torchvision==0.14.1 in /usr/local/lib/python3.10/dist-packages (0.14.1)\n",
"Requirement already satisfied: torchaudio==0.13.1 in /usr/local/lib/python3.10/dist-packages (0.13.1)\n",
"Collecting configargparse\n",
" Obtaining dependency information for configargparse from https://files.pythonhosted.org/packages/6f/b3/b4ac838711fd74a2b4e6f746703cf9dd2cf5462d17dac07e349234e21b97/ConfigArgParse-1.7-py3-none-any.whl.metadata\n",
"\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n",