Serval
Serval provides API access to multiple AI and Natural Language Processing (NLP) tools developed at SIL and beyond.
Overview
- An open-source REST API that enables incremental machine translation and batch translation.
- Developers can access individual components, and users can make use of the reference UI integrations in Lynx and Scripture Forge.
- Statistical machine translation (SMT) supports incremental training and translation suggestions. Incremental training allows the model to learn on the fly as the user translates.
- Neural machine translation (NMT) for back translations into more than 200 languages, based on the NLLB-200 model from Meta.
- Translation into new minority languages by fine-tuning the NLLB model.
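To make the idea of incremental training concrete, here is a deliberately simplified sketch in Python. It is not Serval's SMT engine or its API: the class `ToyIncrementalTranslator` and its methods are hypothetical, and the position-by-position word alignment is an assumption chosen for brevity. It only illustrates the pattern of suggesting a translation, receiving the user's confirmed translation, and updating the model immediately.

```python
from collections import Counter, defaultdict


class ToyIncrementalTranslator:
    """Word-for-word toy model (hypothetical; not how Serval works internally)."""

    def __init__(self):
        # counts[source_word][target_word] = times the pair was seen together
        self.counts = defaultdict(Counter)

    def train_segment(self, source, target):
        """Update the model from one confirmed (source, target) segment."""
        src_words = source.lower().split()
        tgt_words = target.lower().split()
        # Naive assumption: word i in the source aligns with word i in the target.
        for s, t in zip(src_words, tgt_words):
            self.counts[s][t] += 1

    def suggest(self, source):
        """Suggest a translation; unknown words are passed through unchanged."""
        out = []
        for s in source.lower().split():
            seen = self.counts.get(s)
            out.append(seen.most_common(1)[0][0] if seen else s)
        return " ".join(out)


model = ToyIncrementalTranslator()
print(model.suggest("good morning"))         # nothing learned yet
model.train_segment("good morning", "buenos días")
print(model.suggest("good morning friend"))  # learned words are now suggested
```

The point of the sketch is that each confirmed segment changes the next suggestion without a separate retraining step; a real SMT engine learns alignments and phrase statistics rather than matching words by position.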
Features
- Supports externally developed and hosted NLP engines
- File storage for uploading training and inference data for models
- Notification API that informs the calling application of the status of model training and inference
- Translation API for Paratext project files and text files
- Handles USFM, including versification, verse ranges, custom styles, etc.
- Generated translations can be returned as JSON or USFM
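As an illustration of what handling verse ranges involves, the following Python sketch lists the verses covered by the `\c` (chapter) and `\v` (verse) markers in a USFM snippet, expanding a range such as `\v 2-3` into its individual verses. This is not Serval's parser: the function name `list_verses` is hypothetical, and the sketch assumes simple numeric ranges only. It ignores versification differences, segmented verses, custom styles, and every other marker.

```python
import re

# \c <n> starts a chapter; \v <n> or \v <n>-<m> starts a verse or a verse range.
_MARKER = re.compile(r"\\(c|v)\s+(\d+)(?:-(\d+))?")


def list_verses(usfm):
    """Return the (chapter, verse) pairs covered by the markers in a USFM string."""
    chapter = 0
    verses = []
    for kind, start, end in _MARKER.findall(usfm):
        if kind == "c":
            chapter = int(start)
        else:
            # A single verse has no end number; a range covers start..end inclusive.
            last = int(end) if end else int(start)
            verses.extend((chapter, v) for v in range(int(start), last + 1))
    return verses


sample = r"""\c 1
\v 1 First verse text.
\v 2-3 Text that spans a verse range.
\v 4 Fourth verse text.
"""
print(list_verses(sample))
```

A translation returned for the `\v 2-3` segment would need to be mapped back to the whole range rather than to a single verse, which is why range handling matters when results are returned as USFM.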
Future Development
The Serval team has identified several opportunities to improve and extend the APIs. The following features will be prioritized based on feedback from our partners and scheduled for development as team bandwidth and resources allow:
- Integrating industry techniques such as pruning, LoRA, and FlashAttention to optimize the training and inference of NMT models
- Supporting new NMT models, including MADLAD-400 and LLMs
- Interactive machine translation support using NMT models
- APIs for translation quality assessment, including access to Augmented Quality Assessment (AQuA) through Serval
- APIs for audio content, such as text-to-speech, speech-to-text, speech-to-text alignment, and speech translation