Linguistic Services

We offer over 25 years of experience and expertise in developing and managing linguistic data to construct translation algorithms, test Machine Translation (MT) systems, and extract information in multiple languages. We also apply Natural Language Processing (NLP) to build MT, Automatic Speech Recognition (ASR), and Text to Speech (TTS) systems.

Our Linguistic and Natural Language Processing (NLP) Services Include:

Data Refinement and Processing

  • Collect text and document image data from websites, printed material, and handwritten material, as well as speech data from speakers reading prepared texts in multiple languages, for use in training and testing language models
  • Edit, translate, align, linguistically annotate, organize and archive linguistic data
  • Provide quality assurance for multilingual data
  • Maintain, update, refine, transcribe, and describe collected and archived linguistic data

Software Engineering, Natural Language Processing (NLP), Integration, and Test and Evaluation

  • Integrate linguistic software components for MT, Text to Speech (TTS), and Speech Recognition
  • Design and develop custom interfaces for client linguistic research and applications
  • Test and evaluate client multilingual processing components, such as speech recognition, document image processing, and automatic text alignment
  • Deploy automatic Speech to Speech Translation systems within six (6) months of an initial language request
  • Train new Automatic Speech Recognition (ASR), Machine Translation (MT), and Text to Speech (TTS) models using neural network machine learning
  • Build graphical user interfaces (GUIs) on multiple operating systems, including Linux, Windows, and Android, for Speech to Speech (S2S) systems, stream transcription services, and task-driven data collection applications
  • Maintain and train models to support multiple platforms, including Android, Linux, and Windows

Computational Linguistics Research

  • Develop computational linguistic methods for constructing hybrid machine translation systems
  • Use selected software products to investigate, develop and refine research tools
  • Develop hypotheses about performance of hybrid machine translation systems that are grounded in current linguistic and computer science theory and research
  • Develop linguistically motivated tests based on knowledge of syntax, morphology, phonology, and principles of linguistic theory
  • Work with Large Language Models (LLMs) and research current computational techniques and methods
  • Augment data by using the output of one or more models to train other models