This diacritiser was trained on a dataset of millions of data points scraped from websites and from old books (Alshamila library). It uses Conditional Random Fields and achieves higher accuracy than all of the larger, slower diacritisers currently available.
It can also run on small devices (mobile phones).
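To illustrate how diacritisation can be posed as a Conditional Random Field problem, here is a minimal sketch in Python. It treats each base character as a token and the diacritics attached to it as that token's label. The label scheme, the "O" marker for undiacritised characters, the context window, and the function names are all assumptions made for illustration; they are not taken from this diacritiser, whose actual features and labels are not described here.

```python
# Arabic short-vowel and related marks (fathatan .. sukun), U+064B to U+0652.
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}

# Hypothetical label for a character that carries no diacritic.
NO_DIACRITIC = "O"


def to_labelled_sequence(text):
    """Split diacritised text into (base character, diacritic label) pairs."""
    pairs = []
    for ch in text:
        if ch in DIACRITICS:
            if pairs:
                # Attach the mark to the preceding base character
                # (a shadda plus a vowel becomes one combined label).
                pairs[-1][1] += ch
            # A mark with no preceding base character is dropped.
        else:
            pairs.append([ch, ""])
    return [(base, label or NO_DIACRITIC) for base, label in pairs]


def strip_diacritics(text):
    """Return the undiacritised input the model would see at prediction time."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def char_features(chars, i, window=2):
    """Describe position i by the character itself and its neighbours."""
    feats = {"char": chars[i]}
    for off in range(1, window + 1):
        feats[f"prev{off}"] = chars[i - off] if i - off >= 0 else "<s>"
        feats[f"next{off}"] = chars[i + off] if i + off < len(chars) else "</s>"
    return feats


# The word "kataba" written with explicit code points:
# kaf + fatha, ta + fatha, ba + fatha.
example = "\u0643\u064E\u062A\u064E\u0628\u064E"
labelled = to_labelled_sequence(example)
bare = list(strip_diacritics(example))
features = [char_features(bare, i) for i in range(len(bare))]
```

Under these assumptions, one feature dictionary and one label per character is the kind of input a linear-chain CRF toolkit is usually trained on; at prediction time only the stripped text is available and the model outputs a label for each character.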
Please contact Nawar Halabi to obtain a license for this diacritiser.
If you find errors in the automatically generated diacritisation, or want to contribute diacritised text, please feel free to correct or fill in the text area below. This helps us improve performance, and we will later publish the cleaned collected data.