# How to run the code

    python3 tokenizer_for_indian_languages_on_files.py --input input_folder --output output_folder --lang language_code

- `input_folder`: contains the raw files.
- `output_folder`: just give a name (no need to create the folder). It is created automatically, and the tokenized outputs are saved there file by file in SSF format.
- `language_code`: a 2-letter ISO 639-1 code. See the list of codes for the different languages:
  - Hindi: hi
  - Oriya/Odia: or
  - Manipuri: mn
  - Assamese: as
  - Bengali: bn
  - Punjabi: pa
  - Urdu: ur
  - English: en
  - Gujarati: gu
  - Marathi: mr
  - Malayalam: ml
  - Kannada: kn
  - Telugu: te
  - Tamil: ta

Sample run (run this in your terminal):

    python3 tokenizer_for_indian_languages_on_files.py --input Sample-Input --output Sample-Output --lang kn
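
If you want to call the tokenizer from another Python script (for example, to process several folders in a loop), the command line above can be wrapped in a small helper. The following is only a sketch: `build_command` and `run_tokenizer` are hypothetical names that are not part of this repository, and the helper assumes the script is in the current working directory and that `python3` is on your `PATH`. It uses only the flags and language codes listed in this README.

```python
import subprocess

# Script name and language codes exactly as listed in this README.
SCRIPT = "tokenizer_for_indian_languages_on_files.py"
LANG_CODES = {
    "Hindi": "hi", "Oriya/Odia": "or", "Manipuri": "mn", "Assamese": "as",
    "Bengali": "bn", "Punjabi": "pa", "Urdu": "ur", "English": "en",
    "Gujarati": "gu", "Marathi": "mr", "Malayalam": "ml", "Kannada": "kn",
    "Telugu": "te", "Tamil": "ta",
}


def build_command(input_dir, output_dir, lang):
    """Return the argument list for one tokenizer run (hypothetical helper).

    Raises ValueError if `lang` is not one of the codes listed above.
    """
    if lang not in LANG_CODES.values():
        raise ValueError(f"Unsupported language code: {lang!r}")
    return ["python3", SCRIPT,
            "--input", input_dir,
            "--output", output_dir,
            "--lang", lang]


def run_tokenizer(input_dir, output_dir, lang):
    """Run the tokenizer script as a subprocess; raises if it exits non-zero."""
    subprocess.run(build_command(input_dir, output_dir, lang), check=True)
```

Calling `run_tokenizer("Sample-Input", "Sample-Output", "kn")` is then equivalent to the sample run shown above. Checking the language code before launching the subprocess catches typos early, without starting the tokenizer.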